I imagine that anyone in our business with any interest in malware matters is aware of the WildList Organization International, or at any rate of the WildList itself. The WildList is a list of viruses (not, generally speaking, non-replicative malware) known to be spreading "in the wild" (ItW, in AV jargon). How do we know a virus is spreading (or has been fairly recently)? Because to make the main list, it has to have been reported, and a sample submitted, by at least two WildList reporters (mostly representatives of anti-malware vendors, and invariably people with proven expertise in the field). In fact, the WildList is the tip of a larger iceberg: a sizeable collection of validated virus samples. That collection is called WildCore, and its advantage over most of the (often larger) sample sets available on vx (Virus eXchange) sites and elsewhere is that it consists of samples that have been replicated and validated by the reporters who submitted them, and validated again before they're added to the collection.
Recently, though, the WildList has come increasingly under fire. Some of the major problems with it are:
- as it focuses on replicative malware, it doesn't map well onto the wider threat landscape, in which non-replicative malware such as bots now represents a greater threat than old-school viruses and worms.
- the list is always behind the curve: for example, we're well into June, and the latest available WildList is the one for March. That reflects the exhaustive validation process, of course, and the time it takes to cross-match samples from different sources. Even setting those delays aside, though, the WildList has never included all the malware that ever has been (or, more to the point, currently is) ItW, even in the days when virtually all significant malware was viral.
The WildList Organization is aware of these problems, of course, though it's proving painfully slow at addressing them. Nevertheless, many of the better anti-malware testing and/or certification groups find WildCore useful as a baseline. Virus Bulletin's VB100 test, for example, requires a qualifying product to demonstrate "100% detection of malware listed as 'In the Wild' by the WildList Organization" (using default settings, on-access and on-demand), and it mustn't generate false positives on a separate sample set of clean files. That doesn't really amount to "all malware known to be in the wild", because everyone (vendors, other testers and so on) knows of other malware that is definitely "out there" but not formally "ItW" because it isn't currently on the WildList. To quote my colleague Randy Abrams, "The Wildlist provides a scientifically sound test set. The Wildlist does not provide a statistically significant test set." So what use is it for testing, really?
- A smaller set of reliably validated samples may tell you more about a product's detection capabilities than a larger set of less reliable samples. In the anti-malware business, we're all too aware of tests based, at least in part, on invalid samples (this is one of the drivers behind the Anti-Malware Testing Standards Organization, about which I'll probably have more to say another day).
- Detecting the whole test set demonstrates that a product team is keeping up with a subset of (fairly) current threats that is still significant, even if not statistically so.
- It also suggests that the vendor is part of the WildList "community" (otherwise it would be unlikely to have seen and processed every sample in WildCore). That membership may not be a critical indicator, but it does reflect a proven level of expertise and acceptance among the wider research community.
- From a more negative standpoint, consistent failure to meet the certification standard may suggest a problem with the product (anyone can have an off day, but a pattern of failures is harder to excuse). After all, all the vendors represented in the organization receive the test set, so they should, in principle, achieve a reasonable percentage of passes unless the testing is genuinely inappropriate to their detection methodology.
So, would I select a product on the basis of its performance in VB100 or other WildList-oriented tests and certifications, such as ICSA Labs certification or the West Coast Labs Checkmark? Probably not: I'd be more likely to compare performance across a whole range of detection tests, and even then I'd regard them as suggestive rather than wholly reliable indicators. However, poor long-term performance (or refusal to participate in such testing) is something I'd certainly take into account, even though it wouldn't necessarily be a show-stopper.
Disclaimer: I work for an anti-malware vendor (one that generally does well in VB100 tests!) and have a long association with the WildList Organization, Virus Bulletin, and other players in this arena. Of course, this has a lot to do with my interest in (and, hopefully, knowledge of) the area, and I'd like to think that I can comment on these issues impartially. Nonetheless, you should be aware of those affiliations.