Photo by Jennifer Brommer, CC BY 4.0.

Darius Kazemi is an internet artist who makes bots—bots that mix headlines together and bots that make absurd flow charts and bots that buy him random items from Amazon each month. He also relies heavily on Wikipedia for both inspiration and material. His Wikipedia-related bots include the Bracket Meme Bot, which makes “arbitrary brackets, sourced from Wikipedia” and the Empire Plots bot, which comes up with “weird but plausible soap opera-type plots for the TV show Empire.”

I wanted to learn more about how Kazemi uses Wikipedia in his work. Our conversation is below.

———

What was your first project that used Wikipedia?

Oh gosh. Good question. I think it was probably Content, Forever—this was an extension of a failed National Novel Generation Month prototype. The idea is that it generates a semi-coherent “thinkpiece” type article simply by falling down a Wikipedia hole! I wrote a little post on how it works that might be of interest to your readers. It uses the parse action to get JSON formatted content, then parses the HTML clientside using JavaScript and jQuery.
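As a rough illustration of the approach described above, the sketch below builds a parse-action request and pulls article links out of the returned HTML. It is not the Content, Forever code, which runs client-side in JavaScript and jQuery. The function and class names are hypothetical, and the link filtering is an assumption about what "falling down a Wikipedia hole" might involve.

```python
from html.parser import HTMLParser
from urllib.parse import unquote, urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def build_parse_url(title):
    """Build a MediaWiki API URL that returns a page's rendered HTML as JSON."""
    params = {"action": "parse", "page": title, "prop": "text", "format": "json"}
    return API_URL + "?" + urlencode(params)


class WikiLinkCollector(HTMLParser):
    """Collect titles of article links (hypothetical helper for this sketch)."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = (dict(attrs).get("href") or "").split("#")[0]
        # Keep only article links; skip File:, Category:, Help: and external URLs.
        if href.startswith("/wiki/") and ":" not in href:
            title = unquote(href[len("/wiki/"):]).replace("_", " ")
            self.titles.append(title)


def linked_titles(html):
    """Return the article titles linked from a chunk of rendered page HTML."""
    collector = WikiLinkCollector()
    collector.feed(html)
    return collector.titles


# With format=json, the rendered HTML sits at response["parse"]["text"]["*"].
```

Fetching `build_parse_url(...)`, passing the HTML to `linked_titles`, and then picking one of the returned titles as the next page would give one step of the "hole."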

———

In the write-up for the Empire Plots bot, you explained how you employed DBPedia. While DBPedia is structured content extracted from infoboxes on Wikipedia, there’s also a structured info database called Wikidata (run by our sister organization Wikimedia Deutschland). In some cases the infoboxes on Wikipedia are actually generated from the associated structured content on Wikidata, and it too can be queried with SPARQL, the querying language you used with DBPedia, to return some pretty interesting datasets. Do you think there’s potential for Wikidata as a source of content in addition to DBPedia for bot makers?

Wikidata can absolutely be used in addition to DBPedia! Wikidata is always going to be more up-to-date than DBPedia, but the main issue is whether Wikidata has the data you want. For example, for that bot I used the “dbt:subject” relation to determine if an entry belongs to the set of “African_American_actresses”, which doesn’t exist on Wikidata. I suppose that could be worked around with a JOIN operation of some kind, though!

Mikhail’s note: The following query, which can be run on the Wikidata Query Service, finds African American actresses on Wikidata.

SELECT ?person ?personLabel WHERE {
  ?person wdt:P31 wd:Q5. # instance of: human
  ?person wdt:P21 wd:Q6581072. # gender/sex: female
  ?person (wdt:P106/wdt:P279*) wd:Q33999. # occupation: all subclasses of actor (incl. film, voice, and television)
  ?person wdt:P172 wd:Q49085. # ethnic group: African Americans
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
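For bot makers who would rather call the query service from code than from the web interface, here is a minimal sketch using only the Python standard library. The helper names and the User-Agent string are hypothetical placeholders. The label language is fixed to "en" on the assumption that the "[AUTO_LANGUAGE]" placeholder is meant for the web interface.

```python
from urllib.parse import urlencode
from urllib.request import Request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# The query from the note above, with the label language fixed to English.
QUERY = """
SELECT ?person ?personLabel WHERE {
  ?person wdt:P31 wd:Q5.
  ?person wdt:P21 wd:Q6581072.
  ?person (wdt:P106/wdt:P279*) wd:Q33999.
  ?person wdt:P172 wd:Q49085.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""


def build_wdqs_request(query, user_agent="example-bot/0.1 (hypothetical contact)"):
    """Build (but do not send) a GET request asking for JSON results."""
    url = WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})
    return Request(url, headers={"User-Agent": user_agent})


def labels_from_results(data, variable="personLabel"):
    """Pull one variable's values out of a SPARQL JSON results document."""
    bindings = data["results"]["bindings"]
    return [row[variable]["value"] for row in bindings if variable in row]
```

Passing the request to `urllib.request.urlopen`, decoding the body with `json.load`, and handing the result to `labels_from_results` would yield a plain list of names for a bot to draw from.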

———

What is your general process when building a bot after you have the initial idea?

I try to start making the bot as soon as I have the idea, and the first thing I do is a manual, non-computational prototype. So if I have an idea for an algorithm, I will write down the algorithm and then attempt to follow it manually a few times to see if the results are good. So for @BracketMemeBot, I poked around Wikipedia’s Categories to get a sense of what my algorithm might look like, then manually created a few brackets to see the result. This is the research portion of the process, and it’s helpful because I can determine very quickly without writing any code whether or not a bot will be any good. I abandon a lot of projects at this point in the process because the output isn’t what I want, but also no code was written so it’s no big deal.

———

Has there been a time when you wanted to make something that relied on Wikipedia but it wasn’t possible using either the MediaWiki API or DBPedia? What were the limitations you ran into, and what features do you wish the MediaWiki API endpoint had?

It’s tough to say because I almost always set out to do a certain thing, immediately discover it’s not possible, and then redesign the bot within the parameters of what is possible. Just like in my Empire Plots Bot, I had to figure out a way around the fact that there is no “dead African American actors” category on Wikipedia. But I have always been able to find a workaround on MediaWiki or DBPedia — although often, as is the case with Content, Forever, I need to add HTML parsing/scraping to the mix.

Probably the most specific road block I ran into was with Bracket Meme Bot. It would certainly be nice to be able to get a random Category, or to be able to at least get a count of the total number of Categories available and then be able to index to an arbitrary number in that range.
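One possible route to the first wish, offered as a sketch and not as something covered in the interview: the API's list=random module accepts a namespace filter, and namespace 14 is Category on Wikipedia. The helper names are hypothetical, and whether the sampling behaves well enough for a bot like this is an assumption worth checking.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"
CATEGORY_NAMESPACE = 14  # MediaWiki's namespace number for Category pages


def build_random_category_url(limit=1):
    """Build a MediaWiki API URL asking for `limit` random Category pages."""
    params = {
        "action": "query",
        "list": "random",
        "rnnamespace": CATEGORY_NAMESPACE,
        "rnlimit": limit,
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def titles_from_random(data):
    """Extract page titles from a list=random JSON response."""
    return [page["title"] for page in data["query"]["random"]]
```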

———

The size of English Wikipedia makes it a great source of content for bot makers but Wikipedia is also available in almost 300 other languages (with some articles being available in multiple languages), DBPedia is available in 125 languages, and Wikidata is language-agnostic (users can add labels & descriptions in their language). In the global bot-making community, have you seen your colleagues build bots in other languages besides English that utilized the multilingual aspect of those projects?

Yes, there have been translations of some of my bots. @TwoHeadlines has a Spanish version called @DosTitulos! Now that you mention it, I could probably easily translate Content, Forever by changing the language namespace on the API…
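Reading "language namespace" as the language subdomain of the API host (an interpretation, not something the interview spells out), the change could be as small as this sketch. The function name is hypothetical.

```python
def api_endpoint(lang="en"):
    """Return the MediaWiki API endpoint for one language edition of Wikipedia."""
    return f"https://{lang}.wikipedia.org/w/api.php"


if __name__ == "__main__":
    # The same request parameters can then be sent to a different edition.
    for code in ("en", "es"):
        print(api_endpoint(code))
```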

Mikhail Popov, Data Analyst, Reading Product
Wikimedia Foundation