November 6, 2023

Hi, Ivan! Navigating Life with AI: Insights from Ivan Yamshchikov

A month ago, when we launched Hi, AI! Media, we promised to tell stories about neural networks and the people behind them. To fulfill this promise, we're introducing our new column, "Navigating Life with AI." Our first feature is on Ivan Yamshchikov:

  • An AI scientist whose interview garnered over a million views on YouTube;
  • A Professor at the Würzburg-Schweinfurt University of Applied Sciences in Germany;
  • The creator and host of the science podcast "Air it Out!";
  • A consultant for startups in the field of generative AI and a key contributor to the Toloka.AI ecosystem;
  • In 2016, he collaborated with Alexey Tikhonov to release the album "Neural Defense", which consists of songs and poems written by AI. The algorithm they created wrote texts in the style of Yegor Letov, the founder of the "Civil Defense" band.

Ivan, let's start with what's current: for the past two weeks, our editorial team has been generating songs on the Suno neural network. Just last week, we integrated Suno into @GPT4Telegrambot. I'm interested in your opinion as one of the pioneers in creating generative songs. Have you had a chance to test it out?

I have indeed tried Suno. It's a step towards making music creation as effortless as listening. It's a straightforward tool that works great. In the 19th century, Ada Lovelace was the first person to suggest that computers could create music. Not even two hundred years have passed, and we've already managed it. It will only get better. On the one hand, music is becoming commoditized: anyone can create any music they want. That's pretty cool. On the other hand, the value of truly unique creations will continue to grow, as always.

Alright, hold on, there's just one thing I didn't understand: what is commoditization?

It's when an item that used to be available to only a few becomes a commodity, that is, I don't know, it's probably called…

Accessible?

Even more than that. It's when you no longer notice that you are getting it. Take clean water in Western Europe. We don't even think of it as a big deal. Water flows from the tap; in Germany you can drink it and not die, and wash your face without catching anything.

Commoditization is when something that everyone needs plays a crucial role in our lives, yet is so optimized, technologically advanced, and reliable that, on average, we don't even notice we have it: clean water, light, electricity. Electricity is a commodity; in the modern developed world, nothing works without it. Internet access is now going through a phase of commoditization. Streaming services, for instance, have commoditized listening: anyone can listen to music today. Creating it, however, is still not easy.

Suno is a step towards making music creation as easy as listening. It lets you create exactly what you want. This difference might seem small, but in reality it's gigantic. We see the same gap between the television generation and the YouTube generation: the YouTube generation is much more proactive.

Have you already generated something on Suno?

Honestly, I'm still not done playing with images. I spent a lot of time on Midjourney, and now DALL-E 3 has been released. A friend of mine fed some news headlines into DALL-E today, and I can show you the result. The headline goes like this: "Rostov Cossacks battle with hemp and crocodiles." The image is so good it should be on your Telegram. Look, here it is.

Perfect. Ready for print. We could run it as a column in our media.

As for music, I like to work on it manually; I enjoy the process itself. I regularly record it at my desk for myself. Meanwhile, I'm sure that for my children, it will be normal to generate a song for a school project dedicated to space colonists.

Do I understand correctly that a new era has begun in music creation?

It's even more remarkable. An era is beginning where people are transforming from consumers into creators. This is a significant difference. One could argue about how much you are the author of the music created by Suno, but in some sense, you are the author, you are proactive, and you did something to generate a song. Or you came up with a prompt – and a picture appeared.

Is the neural network the author or co-author here? Who should own the copyright?

The concept of copyright is a weird, localized narrative of the 20th century, which we will hopefully soon forget. No one is the author. Newton said, "If I have seen further than others, it is because I have stood on the shoulders of giants." He did something incremental compared to the people before him, and he did it brilliantly. But it is still the result of prior experience, not of a single scientist but of generations. The same goes for music and for everything else. What matters to me about authorship is the feeling that you are the author. It gives you an entirely different understanding of the world.

What are your thoughts on the class-action lawsuits by writers and artists against ChatGPT?

I don't understand the idea of intellectual property. It seems absurd to me. It's like some mass delusion. Like, what does intellectual mean? Leonardo da Vinci lived without intellectual property. Raphael lived without it. Bach lived without intellectual property.

When did you start working with neural networks?

Seven years ago, when we released the album "Neural Defense." I realized I didn't want to work in financial mathematics, the subject of my dissertation, and wanted to delve into language generation instead. So, essentially, I switched.

Listen, it's also important to allow yourself to change fields.

Regarding the ability to switch, there's a great quote by Robert Heinlein that I cite to my students: "A human being should be able to change a diaper, plan an invasion, butcher a hog, conn a ship, design a building, write a sonnet, balance accounts, build a wall, set a bone, comfort the dying, take orders, give orders, cooperate, act alone, solve equations, analyze a new problem, pitch manure, program a computer, cook a tasty meal, fight efficiently, die gallantly. Specialization is for insects."

The uniqueness of human intelligence lies in the ability to see perspectives and find unexpected analogies between different things, processes, and ideas.

What other art projects have you done with neural networks?

In 2017, we generated the central musical theme for the opening of a conference held in Moscow by Yandex (a leading Russian IT company). The AI "listened" to 600 hours of Scriabin's music, and Petr Termen, the great-grandson of Lev Termen (Ed. note: the inventor of the thereminvox, one of the first electronic musical instruments), came up with the arrangement and performed it with a chamber orchestra. It was one of the first generative compositions to be performed by an orchestra. We also had an exhibit on display at the technical museum in Bonn: generative texts in the style of Kurt Cobain.

Do you consider yourself an AI Artist?

I am interested in what is called computational creativity, which is an attempt to comprehend the phenomenon of creativity from the perspective of computer science and algorithms. But I don't consider myself an AI Artist. No, I am an AI Scientist. My primary work is science. And almost all the papers that I like are written about generative language models.

What's your 10-year forecast on how AI will change our lives?

In the coming decade, education will probably change a lot because the old ways of testing what students have learned are not working well. We're witnessing a trend towards the gamification of learning. The role of teachers will rely more on interpersonal interaction. Roughly speaking, we'll revert to the paradigm of the ancient Greeks, where an academy means strolling through a garden with a teacher and discussing various topics. This movement seems right to me.

There will also be a substantial increase in productivity, particularly among "white-collar" workers.

First, ChatGPT changed how we write texts, and then Midjourney altered how we make images. Where will the next breakthrough be?

It's already happening – it's multimodal models, that is, image and text together. A lot is organized around language, and if you can use natural language to create new things, you can change a lot for the better.

Do you use neural networks for searching the Internet?

I use Google. Plus, sometimes, when I need to organize information, I use ChatGPT. There's nothing surprising about it.

Wait, is Google artificial intelligence?

Yes, definitely. How else? All modern search engines rely on neural networks, for example in the algorithms that match your query with an answer. Search development has contributed significantly to the technologies now used in generative models because there's a lot of data available, and it's easy to train on it. Roughly speaking, ChatGPT is trained on the entire Internet. The ability to index a vast amount of textual information, which any search engine has, is the foundation on which modern generative models are built.

Millions of people worldwide use AI without realizing it. When you google, you are using artificial intelligence. When YouTube suggests the next video, you are using artificial intelligence. When you open Netflix, you are using artificial intelligence because the movies are selected for you by algorithms. Machine learning is a process in which an algorithm improves a given quality function through feedback. In the case of Netflix, the quality function is probably how many likes you gave to what it showed you on the first screen. When you open TikTok, the quality function is how many seconds you watch the video.
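The idea of an algorithm improving a quality function through feedback can be sketched in a few lines of Python. This is only a toy illustration, not how TikTok, Netflix, or any real service works: the category names, the watch times, and the functions recommend, record_feedback, and run_feed are all hypothetical. The "quality function" here is the average number of seconds watched per category.

```python
def recommend(avg_watch, shown):
    """Pick the next category: try each one once, then show the best so far."""
    for category, times_shown in shown.items():
        if times_shown == 0:
            return category
    return max(avg_watch, key=avg_watch.get)


def record_feedback(avg_watch, shown, category, seconds):
    """Update the running average watch time for the category just shown."""
    shown[category] += 1
    avg_watch[category] += (seconds - avg_watch[category]) / shown[category]


def run_feed(viewer, categories, steps):
    """Repeat the loop: show something, observe watch time, update estimates."""
    avg_watch = {c: 0.0 for c in categories}
    shown = {c: 0 for c in categories}
    for _ in range(steps):
        category = recommend(avg_watch, shown)
        record_feedback(avg_watch, shown, category, viewer(category))
    return avg_watch, shown


# Hypothetical viewer who always watches a fixed number of seconds per category.
tastes = {"cats": 5.0, "cooking": 12.0, "news": 3.0}
avg_watch, shown = run_feed(lambda c: tastes[c], list(tastes), steps=50)
print(shown)  # {'cats': 1, 'cooking': 48, 'news': 1}
```

After trying each category once, the feed settles on the one the viewer watches longest. A real system would also keep exploring and would deal with noisy feedback, which this sketch deliberately leaves out.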

Try a simple experiment: open your Instagram and start giving feedback on the advertisements. Interact with the ads: like the ones you enjoy and dislike the ones you don't. Very quickly, within a week or so, the ads may become visually more interesting than your friends' posts and give you more aesthetic pleasure than the average post in your feed.

Thank you for the idea. How do you use AI in your work?

In my work, I regularly use ChatGPT for summarizing and simplifying information, especially when editing texts. I ask ChatGPT to highlight problem areas in the text and identify unclear fragments, particularly when I write in English and German.

Can you give an example?

For example, say I've put together a scientific article and noticed that some sections could use improvement. I use ChatGPT, which I've set up as an editor for my scientific texts, to help refine the writing. I paste the article into a separate chat and state in the prompt that ChatGPT is to act as an editor for specific scientific journals and conferences within a particular professional context.

My instructions to ChatGPT often involve enhancing the style of the text. I provide text examples for reference. Sometimes, I even ask for the text to be edited as if by a person raised jointly by Ray Bradbury and Elon Musk. After ChatGPT has edited the text, I always proofread it.

I often use it for brainstorming. Together with sociologists, we worked on an application for researching social networks using NLP (Ed. note: natural language processing). In social networks, there is a concept of "echo chambers," where people repeat the same opinions in a closed circle. We needed to come up with a name for the scientific grant, and it had to be a beautiful acronym in English. After the first round of ideas, I gave feedback, pointing out the options that needed to be simplified or were unremarkable, and asked for another ten ideas building on the best of those proposed. As a result of brainstorming with AI, the name ECHO was proposed, which stands for Enhancing Communication for Harmonious Online interactions. It's a successful example of an acronym that reflects the essence of the project.

When dealing with extensive business texts, I often use the summarization feature to highlight the main ideas.

In the scientific field, recommendation systems are used more often than generative models. There are several services that suggest other scientific works to read based on an analysis of the articles you have already read. A kind of Netflix for scientists. I use Mendeley.com.

How do you use AI in your everyday life?

In my life, I try not to use AI, preferring to walk in the fresh air, interact with my wife and dogs, and not use what I do in my work.

The interview for Hi, AI! Media was conducted by Anatoly Buzinsky.