Hot take: `generic` as a type is a crutch that most tooling uses out of laziness, and it has significantly reduced the usefulness of the PURL spec. How do we improve this?
Isn't the issue that sometimes a given scanner can't know where the package was sourced from?
Like if I'm scanning an arbitrary Linux system, and I see `libssl.so.1` but I don't see it in the local package manager, I don't really have an option other than to call it generic.
I do agree that "generic" seems to be WAY overused though. Maybe tools that report on SBOMs, like FOSSA or whatever, should emit warnings to users about "generic" PURLs.
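To make the warning idea concrete, here is a minimal sketch of what a reporting tool could do with a list of PURL strings pulled from an SBOM. The function names `purl_type` and `warn_on_generic` are hypothetical, and the parsing is deliberately simplified (it only extracts the type and does not validate the rest of the PURL).

```python
import warnings


def purl_type(purl: str) -> str:
    """Return the lowercased type of a PURL such as pkg:npm/foo@1.0."""
    scheme, sep, rest = purl.partition(":")
    if sep != ":" or scheme.lower() != "pkg":
        raise ValueError(f"not a PURL: {purl!r}")
    # The type is the first path segment after the scheme.
    return rest.lstrip("/").split("/", 1)[0].lower()


def warn_on_generic(purls):
    """Warn once per generic PURL and return the flagged PURLs."""
    generic = [p for p in purls if purl_type(p) == "generic"]
    for p in generic:
        warnings.warn(f"generic PURL says nothing about package origin: {p}")
    return generic
```

A real tool would probably report this through its own findings channel rather than Python's `warnings` module; that choice is only for illustration.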
> isn't the issue that sometimes a given scanner can't know from where the package is sourced?
That's the problem: there is no metadata with or in libssl.so.1 that I can reliably use to tell what this is.
Eventually I can see a solution made of:
1. create the metadata, say a simple YAML or deb822 key-value pair file that can then be included upstream or as an overlay
2. define a simple spec for binary formats to include a PURL (say in an ELF section or a WinPE string of sorts, where many of these are already stored)
3. create content-based tools like we have in PurlDB to match code, but maybe more like a bunch of generated YARA rules that would match symbols and strings from source to binaries and can recognize that libssl.so.1 is from OpenSSL 1.1.1g.
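As a rough illustration of step 1 only, the sketch below reads a deb822-style overlay (blank-line-separated stanzas of `Key: value` lines) and maps file names to PURLs. The field names `File` and `Purl`, the function name `parse_overlay`, and the sample PURL are all assumptions made for this example; no such format has been specified in this thread.

```python
import re


def parse_overlay(text: str) -> dict[str, str]:
    """Map file names to PURLs from 'Key: value' stanzas (hypothetical format)."""
    mapping = {}
    # Stanzas are separated by one or more blank lines.
    for stanza in re.split(r"\n\s*\n", text.strip()):
        fields = {}
        for line in stanza.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip().lower()] = value.strip()
        # Only stanzas that name both a file and a PURL are usable.
        if "file" in fields and "purl" in fields:
            mapping[fields["file"]] = fields["purl"]
    return mapping


overlay = """\
File: libssl.so.1
Purl: pkg:generic/openssl@1.1.1g
"""
print(parse_overlay(overlay))
```

A scanner that finds `libssl.so.1` outside the package manager could consult such a mapping before falling back to guessing.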
That's fair. It just seems silly that a spec intended to "uniquely ID a package" supports a type that is the complete opposite of "unique". I guess another way to frame my take is: should `generic` be considered a valid PURL? Keep it as a fallback, sure, but distinguish between "fully qualified" PURLs and "partial" PURLs.
This then gives tooling a path to prompt users to provide the missing context needed to fully qualify the PURL.
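One possible reading of "fully qualified" versus "partial", sketched under assumptions: a PURL is treated as partial if it has no version, or if it is `generic` and carries neither a `download_url` nor a `checksum` qualifier (the two qualifiers the PURL spec suggests for locating generic packages). The rule itself and the name `missing_context` are hypothetical, not part of any spec.

```python
from urllib.parse import parse_qsl


def missing_context(purl: str) -> list[str]:
    """List what a tool could ask the user for; an empty list means fully qualified."""
    without_subpath = purl.partition("#")[0]
    path, _, query = without_subpath.partition("?")
    qualifiers = {key.lower() for key, _value in parse_qsl(query)}
    ptype = path.partition(":")[2].lstrip("/").split("/", 1)[0].lower()

    missing = []
    if "@" not in path:
        missing.append("version")
    # Assumed rule: a generic PURL needs something that pins down its origin.
    if ptype == "generic" and not qualifiers & {"download_url", "checksum"}:
        missing.append("download_url or checksum")
    return missing
```

With this rule, `missing_context("pkg:generic/openssl")` reports both items, which is the list a tool could turn into a prompt.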
All abstractions leak eventually, so we need that escape hatch IMHO. Otherwise you end up with the other issue, which is that there is stuff you cannot track with PURL.
pombreda|9 months ago
Eventually, let's fix this first for C/C++:
https://github.com/aboutcode-org/www.aboutcode.org/issues/30
And based on that approach we can either:
1. create new, sensible types as needed
2. and/or maintain a last resort open registry of generic types at least so we get some sanity in the process.
jessoteric|9 months ago
pombreda|9 months ago
donenext|9 months ago
donenext|9 months ago
pombreda|9 months ago