Project RENOIR and Digitalization in the News Industry
Digitalization has an impact on countless areas of human activity, changing the way we work. Mass media is no exception: news agencies have adopted artificial intelligence and other cutting-edge technologies in their day-to-day operations. Last week, ITMO University hosted a lecture by Aljoša Rehar, head of the digital strategy department at the Slovenian Press Agency, who spoke about digitalization in the news industry and project RENOIR, developed by members of the media and academia (Warsaw University of Technology, Stanford University, and others). ITMO University joined the initiative in June 2018.
Digitalization and Its Challenges
In recent years, the breakneck speed of technological development has had a great effect on the news industry. AI and other technologies shape the format and speed of news agencies’ everyday workflow and pose new challenges to the media community. Among them are a decline in established income streams (especially among newspapers, whose audience has shifted to other platforms); the diversification of platforms and formats (new platforms such as Facebook and Twitter let media outlets share content and form online communities, but force companies to create varied content for each platform); the appearance of new, unorthodox competitors (bloggers, social media, small niche publications); and a lack of funding and human resources.
According to Aljoša Rehar, one solution to these new challenges is digitalization. Its first benefit, the expert says, is that it can reduce the time spent on routine work; numerous tools today give journalists easier access to information in many languages. The second benefit is an improvement in the quality of services and products. For instance, indexing technologies make content easy to find. Rehar also notes the new opportunities that partnerships with research institutions open up for media companies, including new ways of generating profit; traditional media often lack know-how due to a lack of financing, but cooperation between research labs and the media could be a solution.
News outlets test and introduce new technologies into their everyday procedures with increasing frequency, sometimes with help from research institutions. One example of such collaboration is the RENOIR project, an EU Horizon 2020 project within the framework of the Marie Skłodowska-Curie Actions. The Slovenian Press Agency works on the project in collaboration with research labs from several countries; one of the more recent partners is ITMO University.
RENOIR focuses on the development of new processing mechanisms for social data. Participants of the project visit universities and partner companies to share their experience. The project is split into five sections, in which the participants exchange knowledge and innovations in the areas of data infrastructure, data mining, and machine learning, create innovative solutions related to data processing and analysis, and more. The project has four main partners: Warsaw University of Technology (Poland), Wroclaw University of Science and Technology (Poland), Jozef Stefan Institute (Slovenia), and the Slovenian Press Agency (SPA). They are joined by 11 other partners, which include Stanford University, University of California, and ITMO University.
Article tracking is a prototype tool developed by PhD students from Warsaw University of Technology. The technology helps find out how much of a news outlet’s website content was copied by other companies, and how often. Today, the SPA uses it to see how often its partners use its material. Using the Google Analytics API, developers and media creators learn who copies their material and how many clicks any piece of content gets.
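How the prototype detects copied material is not described in the lecture. As a rough illustration only, one common way to estimate how much of an article reappears elsewhere is to compare overlapping word sequences ("shingles"). The function names, the shingle size of five words, and the sample texts below are assumptions made for this sketch, not details of the SPA tool.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping word n-grams (shingles) in the text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # Very short texts are treated as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(original: str, candidate: str, size: int = 5) -> float:
    """Fraction of the original's shingles that also appear in the candidate."""
    source = shingles(original, size)
    if not source:
        return 0.0
    return len(source & shingles(candidate, size)) / len(source)


# Hypothetical sample texts, for illustration only.
ORIGINAL = ("The agency reported that the new prototype tracks "
            "copied articles across partner websites")
REPRINT = "Breaking: " + ORIGINAL + " every day."
UNRELATED = "Weather will be sunny tomorrow in the capital city"

if __name__ == "__main__":
    print(copied_fraction(ORIGINAL, REPRINT))
    print(copied_fraction(ORIGINAL, UNRELATED))
```

A score close to 1 suggests the candidate page reproduces most of the original, while a score near 0 suggests independent text. A real system would also need to fetch and clean the candidate pages, which this sketch leaves out.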
Another complex solution, developed by the Wroclaw University of Science and Technology, focuses on the automated sorting of articles by topic. Previously, the journalist decided which categories an article belonged to, which made the human factor an issue; now this is done by a system that analyzes existing articles to decide on the most relevant topics.
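The article does not say which method the Wroclaw system uses. As a minimal sketch of the general idea of learning topics from existing articles, the example below builds one word-count profile per topic from already labeled texts and suggests the topic whose profile is most similar to a new article. All function names, the toy archive, and the choice of cosine similarity over raw word counts are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def train_profiles(labeled_articles) -> dict:
    """Sum the word counts of all archived articles for each topic."""
    profiles = {}
    for topic, text in labeled_articles:
        profiles.setdefault(topic, Counter()).update(tokenize(text))
    return profiles


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def suggest_topics(text: str, profiles: dict, top_n: int = 1) -> list:
    """Return the top_n topics whose profiles best match the text."""
    vector = Counter(tokenize(text))
    ranked = sorted(profiles, key=lambda t: cosine(vector, profiles[t]),
                    reverse=True)
    return ranked[:top_n]


# Hypothetical archive of already categorized articles.
ARCHIVE = [
    ("sports", "The team won the football match after a late goal"),
    ("sports", "The coach praised the players after the match"),
    ("economy", "The central bank raised interest rates to curb inflation"),
    ("economy", "Inflation and interest rates worry the markets"),
]
PROFILES = train_profiles(ARCHIVE)

if __name__ == "__main__":
    print(suggest_topics("Bank keeps interest rates unchanged", PROFILES))
```

A production classifier would normally weight words (for example with TF-IDF) and train on far more data; the point here is only that the categories come from the archive rather than from an individual journalist's judgment.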
The most complex solution is called Event Registry. This tool was developed in collaboration with Bloomberg and the project’s academic members. Event Registry is a global cross-lingual news aggregator that gives media creators better, faster access to information about current events regardless of the language in which that information is available. The system offers automated analysis and sorting capabilities, and can identify events by extracting them from news reports and clustering them by topic. It also offers several features useful for newsmakers:
The first is a ranking of the day’s top news. The registry provides a quick, simple selection of events that are most relevant at a given moment. It also does the same with people, organizations and locations, saving time and allowing users to work with content in multiple languages.
Another important function is assistance in presenting events. When looking for material, journalists can use the system to compare how different sources cover the same events. As Aljoša Rehar explains, this is especially important when reporting on events happening in other countries.
The third feature is quick access to information about relevant global events. Newsmakers no longer need to monitor the news outlets of other countries, since the system does that for them. It can also find news from various sources in a particular region. For example, the SPA can use it to look up how another country reports on events in Slovenia, and which Slovenian public figures or institutions are of most interest to readers abroad.
Event Registry can also automatically compile metadata for news articles (many outlets still do this manually), update themed sections of webpages, visualize data as sitemaps, graphs, and so on, compare the efficiency rates of various news agencies, and more.
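The event identification and daily ranking described above can be illustrated with a deliberately simple sketch. It groups headlines that share enough keywords into one event and ranks events by how many reports they gathered. This is not Event Registry's actual algorithm or API: the stopword list, the similarity threshold of 0.3, the function names, and the sample headlines are all assumptions, and the cross-lingual matching that the real system performs is out of scope here.

```python
import re

# Small, hypothetical stopword list for the sketch.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "for",
             "after", "as", "is", "at", "by"}


def keywords(text: str) -> set:
    """Return the set of lowercase words in the text, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOPWORDS}


def jaccard(a: set, b: set) -> float:
    """Share of keywords two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_events(headlines, threshold: float = 0.3) -> list:
    """Greedily assign each headline to the most similar existing event."""
    clusters = []  # each cluster: {"keywords": set, "headlines": list}
    for headline in headlines:
        words = keywords(headline)
        best = max(clusters,
                   key=lambda c: jaccard(words, c["keywords"]),
                   default=None)
        if best is not None and jaccard(words, best["keywords"]) >= threshold:
            best["headlines"].append(headline)
            best["keywords"] |= words
        else:
            clusters.append({"keywords": set(words), "headlines": [headline]})
    return clusters


def rank_events(clusters) -> list:
    """Order events by the number of reports, most covered first."""
    return sorted(clusters, key=lambda c: len(c["headlines"]), reverse=True)


# Hypothetical headlines, for illustration only.
HEADLINES = [
    "Earthquake strikes northern Italy",
    "Parliament approves new budget",
    "Strong earthquake strikes Italy overnight",
    "Parliament approves budget after debate",
    "Earthquake strikes Italy again",
]

if __name__ == "__main__":
    for event in rank_events(cluster_events(HEADLINES)):
        print(len(event["headlines"]), event["headlines"][0])
```

Ranking by report count is only one possible notion of relevance; the article does not specify how Event Registry decides which events are most relevant at a given moment.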
Staff at the SPA plan to improve the mechanisms responsible for fake news detection, bias and mood analysis, event prediction, and the identification of breaking news, as well as to add support for more languages and automated news reporting (robojournalism).