Malware

Malware is usually created as follows: a hacker writes the code, tests it on their own system, and then looks for ways to bypass antivirus protection. The first antivirus programs detected malware by its MD5 hash, but the hash changes if you change even a single character of the code. So more complex methods were developed, such as detecting viruses by characteristic fragments of code or by the file's structure.
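To see why hash-based detection is so fragile, here is a minimal sketch using Python's standard hashlib module. The two byte strings and the one-entry "signature database" are made-up stand-ins, not real samples or a real product's format.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hex string."""
    return hashlib.md5(data).hexdigest()


# Hypothetical stand-ins for a known sample and a variant
# that differs from it by a single character.
original = b"print('payload v1')"
variant = b"print('payload v2')"

# Hypothetical signature database that only knows the original sample.
known_hashes = {md5_hex(original)}


def is_known(data: bytes) -> bool:
    """Hash-based detection: flag data only if its exact hash is listed."""
    return md5_hex(data) in known_hashes


print(is_known(original))  # True
print(is_known(variant))   # False: one changed character gives a new hash
```

The variant behaves the same as the original, yet it slips past the check, which is why detection moved on to code fragments and file structure.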

Viruses also tend to follow a particular course of action: for example, most try to open another executable and write something of their own into it, or to meddle with another process's memory. Analyzing a program's behavior for such obviously malicious actions is called the heuristic approach. A question remains, though: which behavior do we treat as legitimate, and which do we not? Many "normal" programs act like viruses. There are also many ways to bypass such checks; "upgrading" the malware to do so takes only a couple of hours.
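The heuristic idea, and its false-positive problem, can be illustrated with a toy scoring rule. The action names, weights and threshold below are all invented for the example; real heuristic engines are far richer.

```python
# Hypothetical action names and weights, chosen only for illustration.
SUSPICIOUS_WEIGHTS = {
    "write_to_other_executable": 5,
    "write_to_other_process_memory": 4,
    "read_own_config": 0,
}
THRESHOLD = 5  # assumed cut-off for raising an alarm


def heuristic_score(actions):
    """Sum the weights of all observed actions (unknown actions score 0)."""
    return sum(SUSPICIOUS_WEIGHTS.get(action, 0) for action in actions)


def looks_malicious(actions):
    """Flag a program whose total score reaches the threshold."""
    return heuristic_score(actions) >= THRESHOLD


file_infector = ["read_own_config", "write_to_other_executable"]
text_editor = ["read_own_config"]
software_updater = ["write_to_other_executable"]  # legitimate program

print(looks_malicious(file_infector))     # True
print(looks_malicious(text_editor))       # False
print(looks_malicious(software_updater))  # True: a false positive
```

The updater patches other executables for a good reason, but the rule cannot tell it apart from a file infector. That is the "which behavior is legitimate" question in miniature.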

Antiviruses

Antivirus development goes like this: first, new malware appears that can bypass the antivirus products popular among its "targets". Such a program will compromise a computer even if it is protected, because it has already been tested on similar configurations, and users have no protection from the attack for a certain period of time. First the malware has to be detected; then, once the vendor releases the corresponding update, more time passes before the user installs it. Even if the vendor releases updates very quickly, there is no way to shorten the time that passes before the virus is detected, and this calls for new methods of analyzing program behavior.


One method is to record so-called "slices": snapshots of the operating system that describe its static and dynamic state. For example, you turn on your computer and run Word. The program consumes RAM, EMS, CPU time, etc. There are hundreds of counters that reflect the behavior of a process. One can record slices every few hours to trace malware; yet even this method is easily fooled by malware that falsifies its data so as to look "legitimate".
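As a rough illustration of comparing two such snapshots, the sketch below diffs the counters of one process and reports those that grew more than allowed. The counter names, values and limits are assumptions made up for the example; a real slice would hold hundreds of counters.

```python
def slice_delta(earlier, later):
    """Per-counter change between two slices of the same process."""
    return {name: later[name] - earlier.get(name, 0) for name in later}


def exceeded(delta, limits):
    """Sorted names of counters whose growth is above the allowed limit."""
    return sorted(
        name for name, change in delta.items()
        if change > limits.get(name, float("inf"))
    )


# Hypothetical slices of one process taken a few hours apart.
morning = {"ram_mb": 120, "cpu_seconds": 30, "open_files": 12}
evening = {"ram_mb": 135, "cpu_seconds": 900, "open_files": 14}

# Hypothetical per-counter growth limits.
limits = {"ram_mb": 200, "cpu_seconds": 600, "open_files": 50}

delta = slice_delta(morning, evening)
print(exceeded(delta, limits))  # ['cpu_seconds']
```

The weakness is visible here too: the check trusts the counter values, so malware that keeps its reported numbers within the limits goes unnoticed.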

Our CODA project is based on a similar but more reliable method: we record the program's requests to the system kernel. Unlike counter data, these cannot be falsified. We use them to build a model that makes it possible to tell different processes apart: even two similar programs written in different languages behave differently with regard to these requests. Simply put, slices are good for analyzing consequences, whereas we study programs in action. If a program's readings go beyond the boundaries of its model, we can "freeze" it or forbid it to perform certain actions, for example, to make particular requests.
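The text does not describe how CODA's model is built, so the sketch below is not CODA's method. It shows one well-known way to model a program by its requests to the operating system: remember every short window of consecutive requests seen during learning, then measure how many windows in a new trace were never seen. The request names, window length and threshold are assumptions.

```python
def ngrams(trace, n=3):
    """All windows of n consecutive requests in a trace."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]


def learn_model(traces, n=3):
    """The model is simply the set of windows seen during learning."""
    model = set()
    for trace in traces:
        model.update(ngrams(trace, n))
    return model


def anomaly_rate(model, trace, n=3):
    """Fraction of windows in the trace that the model has never seen."""
    windows = ngrams(trace, n)
    if not windows:
        return 0.0
    unseen = sum(1 for window in windows if window not in model)
    return unseen / len(windows)


def decide(model, trace, threshold=0.3, n=3):
    """'freeze' the process if it strays too far from its model."""
    return "freeze" if anomaly_rate(model, trace, n) > threshold else "allow"


# Hypothetical request traces recorded while the program behaved normally.
normal_runs = [
    ["open", "read", "read", "close"],
    ["open", "read", "write", "close"],
]
model = learn_model(normal_runs)

usual = ["open", "read", "write", "close"]
odd = ["open", "read", "write_process_memory", "connect", "close"]

print(decide(model, usual))  # allow
print(decide(model, odd))    # freeze
```

A threshold above zero leaves some room for small, legitimate changes in behavior, such as those that come with a program update.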

If a legitimate program is updated, the change in its behavior is insignificant: few developers would completely rewrite a program for an update. Still, to be sure, we collect all the models of program behavior from user PCs on a common server we call "the Oracle". For example, if a user installs a new version of a browser or adds an extension, and the browser's request model changes, CODA uses data from the server to augment the model and then sends it back.


We proved our system's efficiency in May 2016, when we launched 60,000 different malware samples in a network of virtual machines. The detection rate was over 90%, with no more than 4% false positives. And that was after half an hour of learning on a nominally "healthy" system, without using any ready-made databases.

How it all came to be

In 2010, I graduated from the Faculty of Mathematics and Mechanics of Ural Federal University. Yekaterinburg is where the CTF (Capture the Flag) movement originated: team competitions in computer security, where every team controls one server (studies it and tries to improve its security) and tries to hack the others'. We were the first to participate; soon, teams from other universities joined the contest. That was when I became really interested in computer security.

After Ural Federal University, I went to the Faculty of Mathematics and Mechanics of Saint Petersburg State University for my Master's degree, and in 2015 I became a research student at the Department of Systems Programming. The CODA project was founded in 2009 by Maksim Baklanovsky, my scientific advisor. That was when Kaspersky Lab held a contest to create the antivirus system of the future; the CODA project won, but it did not develop further until I joined Maksim Viktorovich, who now also works at ITMO University. We registered a legal entity; in 2013 I became the company's CEO, and in 2014 we became a Skolkovo resident.

ITMO University. Artur Hanov

Plans for the future

We want to release CODA as an independent product. By the way, we don't call CODA an antivirus: an antivirus implies comprehensive protection of an operating system, and we believe there is no need for yet another virus database. CODA is instead more like a protective coating for the system. After CODA is installed, programs that were already running will continue to run, but any new ones won't.

On February 18, we will take part in the Demo Day of ITMO's Future Technologies accelerator, where we will present our project. By then we also plan to launch a website where one can leave an email address to sign up for alpha testing. We expect to have a working prototype as early as the first quarter of 2017. Once installed on a user's PC, CODA will start in learning mode and build models of processes, which it will send to our server. It will be impossible to identify a user from the data we collect: it is just sets of numbers, a binary mishmash. Still, we will launch a portal where every user can see how the system works and whether there were any anomalies or security incidents. As soon as we are sure that our product really is effective at protecting user systems, we will start beta testing. According to our plans, that will be in the second quarter of 2017. Once complete, CODA will be distributed by subscription.