Showing posts with label mexico. Show all posts

December 13, 2011

Mapping Wikipedia Article Quality in North America

The maps of Wikipedia previously posted on the blog offer useful insights into the geographies of one of the world's largest platforms for user-generated content. They, along with similar visualizations, reiterated some of the massive inequalities in the layers of information that augment our planet.

But not all articles are created equal, and those maps didn't give us much of a sense of the quality of articles. "Quality" is obviously a slippery word and there are countless ways of measuring it, but for the purposes of this post, we'll crudely use the term to refer to article length (future maps will employ a variety of other metrics).

The maps below visualize this measure of quality within Wikipedia entries: yellow dots represent the location of relatively short articles in the English version of Wikipedia (e.g. the article on "Bandana, Kentucky"), while red dots indicate the location of relatively long articles (e.g. the article on the "Republic of Molossia").


The map below displays the same data, but with smaller dots, making it easier to see some of the patterns if you expand the image.


Interestingly, the states with the highest average word counts are New Jersey (966) and Michigan (914). The states with the lowest averages are Delaware (534) and West Virginia (492). The reasons for these rather large differences are unclear.
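For readers curious how state averages like the ones quoted above could be computed, here is a minimal sketch in Python. It assumes each article has already been geocoded to a state and that "length" means a simple whitespace-separated word count; the function name, the state labels, and the sample records are all hypothetical placeholders, not the data or method behind the maps.

```python
from collections import defaultdict


def average_word_count_by_state(articles):
    """Return {state: mean word count} for an iterable of (state, text) pairs."""
    totals = defaultdict(lambda: [0, 0])  # state -> [total words, article count]
    for state, text in articles:
        totals[state][0] += len(text.split())  # crude whitespace word count
        totals[state][1] += 1
    return {state: words / count for state, (words, count) in totals.items()}


# Hypothetical toy records (placeholder text, not real Wikipedia articles).
sample = [
    ("State A", "one two three four five"),
    ("State A", "one two three"),
    ("State B", "one two three four five six seven eight nine"),
]

averages = average_word_count_by_state(sample)
for state, mean_words in sorted(averages.items()):
    print(f"{state}: {mean_words:.1f} words per article")
```

A mean is sensitive to a handful of very long articles, so a median per state might tell a different story; that choice is left open here.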

Are Wikipedians from New Jersey that much more loquacious than their West Virginian counterparts? Or does it just take more words to describe the many dazzling wonders of New Jersey? Or is it something else entirely?

Apart from the obvious and increasingly evident urban bias in these information geographies, we'd certainly welcome your thoughts in explaining some of these patterns.

June 22, 2009

Information Inequality

Following on from the last post, here are some examples of Google placemark inequality:

First of all, China offers perhaps one of the most striking examples of regional disparities. Beijing, Shanghai, and the Pearl River Delta Region are all characterized by heavy information densities. In other words, a lot of information has been created and uploaded about these places. However, much of the rest of the country has very little cyber-presence within the Google Geoweb. In the map below, the height of each bar indicates the number of placemarks in each location.


The U.S.-Mexico border along the Rio Grande river offers a similarly striking contrast between high and low information densities.


The border between North and South Korea offers another example of placemark density not being correlated with population density. For obvious reasons, very little information is being created and uploaded about North Korea. In the map below (top), each dot represents 100+ placemarks. Interestingly, there are strong similarities between the map of placemarks on the Korean Peninsula and satellite maps of lights visible from the Peninsula at night (bottom).


image source: globalsecurity.org

Information inequalities are clearly a defining characteristic of the Geoweb. Some places are highly visible, while others remain a virtual terra incognita. In particular, Africa, South America, and large parts of Asia are being left out of the flurry of mapping that is happening online (e.g. the Tokyo/Yokohama metro region has almost three times as many 0/1 placemark hits (923,034) as the entire continent of Africa (311,770)). Some of the geographical implications of cyber-visibility and invisibility have been examined in part (e.g. here and here), but there is clearly a lot more to be discussed. In particular, because Google allows any keyword to be searched for (not only "0" and "1"), we are able to explore not only the raw amounts of information attached to each place, but also the contents of that information.
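As a quick check on the "almost three times" comparison, the ratio can be worked out directly from the two hit counts quoted above. The variable names below are just illustrative.

```python
# Placemark hit counts as quoted in the post.
tokyo_yokohama_hits = 923_034
africa_hits = 311_770

# How many times more hits the Tokyo/Yokohama region has than Africa.
ratio = tokyo_yokohama_hits / africa_hits
print(f"Tokyo/Yokohama has {ratio:.2f} times as many hits as Africa")
```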