Guide on IT meetups in St. Petersburg. Part 3: Events for Data Scientists

Alice Koroshavina, a Master's student at the Faculty of Information Security and Computer Technologies at ITMO University, wrote a blog series for ITMO news about the best IT events in St. Petersburg. Here you can find the first and second parts of her blog.

When organizing meetings for data scientists, it is difficult to gather people of the same background and specialization in one room. The audience can include both those who apply ready-made solutions and those who develop methods and algorithms. Programmers hate long formulas on slides, while mathematicians are afraid of code. An in-depth lecture on Natural Language Processing can seem too complicated both for a newbie in Data Science and for an experienced researcher in Computer Vision. Based on how organizers approach these problems, three types of events can be distinguished:

1) The meetups and breakfasts of ODS and the PyData conference remind me of a kaleidoscope: one element (the atmosphere) is repeated in many different fragments (the events). Attendees come for fun, networking and inspiration. Presentations cover a diverse range of topics: neural network architecture, model optimization, scraping and data preprocessing, data processing pipelines. There is little or no code on the slides. Unsolved problems in mathematics are not discussed. The emphasis is on trends, successful cases and the pitfalls of using ML / DL to solve business problems.

2) If you have a knack for maths, follow the announcements of the Computer Science Centre and the Research Lab at ITMO. Participants of their open lectures and seminars discuss papers on model validation and new approaches in ML/DL. Many CSC lecturers also teach at JetBrains Research. Sometimes researchers from other universities come to share their experience. A month ago IFMO and Huawei Lab started to organize seminars on Natural Language Processing. If you want to know what absolute hardcore is, visit the seminar on acoustics at the RAS. The participants discuss unsolved problems in mathematics. Very few people, if any, can understand the formulas that applicants for a candidate degree write on the blackboard. Nevertheless, the professors tell them that their results are trivial.

3) People with a developer's mindset who are involved in Data Science projects should enjoy SpBDSM. These hardcore meetups with in-depth talks are no less useful than conferences, but they are informal. The speakers are experienced researchers from large companies. The event is held once every six months at LenDoc Bar, with its high ceilings and unobtrusive movie soundtracks. The atmosphere inspires participants to talk about programming as an art, not a craft. I saw something similar at the last Golang Piter meetup, which was held in a loft. Usually meetups take place in the offices of partner IT companies. For companies it is a kind of offline inbound marketing that makes their HR brand more popular among developers. For communities it is convenient, since the companies help to organize a high-quality broadcast or video recording and a coffee break. When a meetup is held without the support of any company, the organizers have to set the microphone levels on a mixer or provide the speakers with a lavalier microphone. Otherwise, they have to remove background noise from the video recording with programs such as Adobe Audition. Among the speakers of SpBDSM there are both engineers who can fix bugs or implement new features in the source code of an open-source library and researchers who focus on the practical results of their research.

The organizers find speakers who talk about their experience, not about themselves. Speakers seem to unconsciously follow the words of Linus Torvalds: "Talk is cheap. Show me the code." There is no stand-up comedy and there are few memes on the slides, only a few key points and a lot of code. By the way, mathematicians apologize for their "academic code" in Python that looks like pseudocode (though everybody understands that no one can be good at everything). Before slides with long formulas, the organizers schedule short coffee breaks. Speakers present their research results in a wide range of areas: embeddings, data preprocessing and vectorization, interesting architecture solutions in Data Science projects. There are also talks on such relevant topics as the application of ML/DL methods in cybersecurity (biometric data, generative adversarial networks, data sanitization, etc.). On the sidelines, you can discuss features of different versions of Python, such as optional typing in Python 3.7, the capabilities of other languages for more efficient implementation of algorithms, like multiprocessing t-SNE in Go, and how to write clean, maintainable, scalable code in a data science project.
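For readers who have not met optional typing yet, here is a minimal sketch of what such a hallway conversation is usually about. It is not taken from any of the talks: the `Embedding` class and the `find_embedding` function are hypothetical names invented for illustration. Type hints themselves predate Python 3.7, while the `dataclasses` module used here is available from 3.7 onward. The annotations are not enforced at runtime; a separate static checker has to be run to catch mismatches.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Embedding:
    """Hypothetical record: a token and its vector representation."""
    token: str
    vector: List[float]


def find_embedding(token: str, table: List[Embedding]) -> Optional[Embedding]:
    """Return the first embedding stored for `token`, or None if it is missing."""
    for item in table:
        if item.token == token:
            return item
    return None


# Toy data, made up for the example.
table = [Embedding("cat", [0.1, 0.2]), Embedding("dog", [0.3, 0.4])]

print(find_embedding("cat", table))
print(find_embedding("bird", table))
```

The `Optional[Embedding]` return type makes the "not found" case explicit, which is the kind of detail that helps keep data science code maintainable.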

Comparing the atmosphere of communities organized around different technologies, we can see a connection with the business problems those technologies solve. Programming languages are just instruments, but behind the decisions of their designers there is a philosophy that favors a particular approach to solving a certain set of problems.