AI and the Future of Journalism


Content is at the root of everything we do on the web. We browse information and search through news and products, always looking for something. Written content (texts, documents, chats) is crucial to understanding how we operate and is, I believe, essential to the eventual creation of Artificial General Intelligence. That’s why I’ve spent the past year researching how AI can be applied to journalism. This text summarises some of my recent findings.


Project AI media


I first ventured into AI journalism with the PetaCrunch project (and PetaPR for press releases; before that I wrote about Synthetic Content, the first step in this direction). The goal with PetaCrunch was to find the real bottlenecks in a journalist’s workflow, such as sending emails, receiving materials and posting stories. My thesis was that 99.9% of non-investigative journalism can be automated, and that gathering data is the hardest part. After six months of working on PetaCrunch I can basically confirm this thesis.


PetaCrunch and PetaPR became semi-automated media outlets covering business news and interviews. It is worth noting that PetaCrunch posts only original content, namely interviews conducted over email with business executives and startup founders. Automating email handling allowed us to contact tens of thousands of businesses. We targeted those that had secured funding, asking about their plans and conducting a short interview. We conducted over 1,000 interviews with companies from around the world, all within 3 months. This showed me what is missing when it comes to full automation and gave rise to Contentyze.


PetaCrunch is still doing fine and you can see for yourself if you follow the link. However, it was just one step in a more universal direction: content creation at scale.


AI Content Marketing


Content is key for any news outlet. You need to publish relevant news regularly if you want to attract visitors and monetise them. The spectrum of written content ranges from low-level reporting on local weather, sports and business deals to high-level investigative journalism and involved opinion pieces (think The New Yorker).


You still can’t automate the latter, but you can very well start automating the lower levels. There are already tools that let you post fully automated news about football matches right after they are played, or news about traffic or new businesses in town. Chances are you have already read automated news without realising it. And to be honest, this is great for journalists, who can focus on more creative tasks.


How does it work? Any news story is a matter of connecting a data source to a template algorithm. A data source provides you with a stream of data about a particular niche, and a template algorithm gives you a text based on this data (the ‘prompt’, in machine learning terminology). Template algorithms range from purely hand-engineered software to deep learning language generation models.
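
To make the idea concrete, here is a minimal sketch of the simplest case: a hand-engineered template fed by a data source. Everything in it is an assumption made for illustration (the MatchResult record, the function names, the made-up teams and scores), and it is not how PetaCrunch or Contentyze is actually built.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class MatchResult:
    """One record from a hypothetical sports data feed."""
    home: str
    away: str
    home_goals: int
    away_goals: int


def render_match_report(result: MatchResult) -> str:
    """Hand-engineered template: pick a sentence pattern based on the score."""
    if result.home_goals == result.away_goals:
        return (f"{result.home} and {result.away} drew "
                f"{result.home_goals}-{result.away_goals}.")
    if result.home_goals > result.away_goals:
        winner, loser = result.home, result.away
        high, low = result.home_goals, result.away_goals
    else:
        winner, loser = result.away, result.home
        high, low = result.away_goals, result.home_goals
    return f"{winner} beat {loser} {high}-{low}."


def generate_news(source: Iterable[MatchResult]) -> Iterator[str]:
    """Connect a data source to the template: one text per record."""
    for result in source:
        yield render_match_report(result)


# A hard-coded list stands in for a live data stream (made-up data).
feed = [
    MatchResult("Northfield", "Eastport", 2, 1),
    MatchResult("Westbrook", "Southgate", 0, 0),
]

for story in generate_news(feed):
    print(story)
```

Moving towards the deep learning end of the spectrum would mean replacing render_match_report with a call to a language generation model, with the record serialised into its prompt. The data source and the loop around it would stay the same.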


The deeper you go into the technology, the higher-level the content you can automate.


Future of Journalism


Having understood that AGI and text generation are related, I made it my goal to automate as much of content creation as possible. I wanted to turn this into a larger platform, and that’s how Contentyze came to be. I envision a universal platform that gives you texts based on the data sources you provide. The role of a journalist will then be purely about crunching data and choosing sources to pursue, with little or no writing.


With Contentyze I plan to build a universal set of tools for media agencies, news outlets and marketing companies. We’re not there yet, but we’re getting closer. In 2–3 years journalism will be ripe for disruption at an unprecedented level. The last year was a wild ride, and only now can I see how far I’ve come with these ideas: from synthetic content, through PetaCrunch, to Contentyze and beyond (this part of the story has to remain untold for now).


On a final note, content generation is closely related to journalism, but it’s much bigger than that. In the end we tell our stories through content, be it news, business documents or our daily communication. That’s why language understanding is the core functionality of any more involved AI system striving for completeness.


This little text will serve a role in the future. 


There’s no real end to this story, as it is only unfolding right now.


Stay tuned.


How to use Artificial Intelligence in Journalism



For future reference (mostly for myself, but if you can use it, even better), I want to note some of the texts, materials and books I found useful or want to read soon.


First of all, there’s the blog Robot Writers AI, run by a tech journalist who tracks recent cases of AI being applied by various organisations. It is worth reading, especially for its further references and interesting commentary.


Then there are two recent books treating how AI is changing journalism:


1. Newsmakers: Artificial Intelligence and the Future of Journalism


2. Automating the News: How Algorithms Are Rewriting the Media


Both books are written by academics and survey applications ranging from social media bots to automated news writing and analytics used in investigations.


There’s also a report by LSE and the Google News Initiative on AI in journalism, which you can find here.


I will add more sources here as I become aware of them. Let me know if you have any suggestions.