
Society for American Baseball Research Projects


Architecture
Thursday, 19 February 2009 14:40

Articles

The encyclopedia would have articles on all players, franchises, team-seasons, league-seasons, and any other topics that members see fit to create. MediaWiki articles are naturally free-form, but templates make it easy to hold them to a consistent style. For example, all of the player articles in the demo have "infoboxes" with player information. In this way, articles in the broad, common categories would follow a similar format. This has been done successfully on Wikipedia, where the pages for any two countries look very similar, and the same holds for many other categories.
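As a rough illustration of how a template keeps player articles uniform, the sketch below builds the wikitext for a template call from a dictionary of fields. The template name "Infobox player" and the field names are assumptions made for this example; the demo's actual templates may be named and structured differently.

```python
def render_infobox(template_name, fields):
    """Build the wikitext for a MediaWiki template call.

    The template name and field names passed in are hypothetical;
    they are not taken from the SABR demo.
    """
    lines = ["{{" + template_name]
    for key, value in fields.items():
        # Each named parameter goes on its own line: | key = value
        lines.append(f"| {key} = {value}")
    lines.append("}}")
    return "\n".join(lines)


# Illustrative placeholder data, not real research.
player = {"name": "Example Player", "bats": "Right", "throws": "Right"}
infobox_text = render_infobox("Infobox player", player)
print(infobox_text)
```

Because every player article would call the same template, a change to the template's layout would carry over to all of those articles at once.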

Statistics

Statistics, and any numerical data, would appear to the viewer organized in tables, just as they would in almost any web page. Editing statistics, however, is more interesting, and is one of the important ideas of this plan. Data can be stored in tables in several ways, depending on its source:

  • Just in a table - for some data, just making it display correctly is enough. It would not make sense to create a complicated structure for the Harry Frazee article to store records of his financial trouble.
  • In a table with semantic markup - this is an exciting feature. Putting data into a table in this way lets a computer understand the data and extract it into a CSV file, spreadsheet, or database. This can be used to move data out of formats that make it hard to reuse (MS Word files, paper notebooks, etc.) and into usable ones. It also allows the markup to be added by someone other than the initial researcher. For example, a researcher could copy and paste valuable research from a Word document into a wiki page, yielding a temporarily poor page, and someone else could then edit the page to make the information usable, without any coordination between the two. A third member, one of the wiki editors, could then flag the page for public viewing.
  • Imported into a "vetted" table from a database - this is a valuable feature for SABR. MediaWiki supports edits by "bots", programs that run on a schedule (or are triggered when changes occur) and automatically make certain edits to pages. Authoritative baseball stats would be pushed into pages by bots and displayed as authoritative. Any edits to these tables would be corrected by bots before an editor had a chance to flag them for display.
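To make the extraction idea concrete, here is a minimal sketch of pulling rows out of a table written in wiki markup and writing them as CSV. It assumes plain single-line wikitable syntax (`!` header cells separated by `!!`, `|` data cells separated by `||`) rather than any particular semantic extension, and the sample rows are illustrative placeholders.

```python
import csv
import io


def wikitable_to_rows(wikitext):
    """Parse a simple wikitable into a list of rows (lists of strings).

    Handles only the single-line cell form; attributes on cells and
    multi-line cells are out of scope for this sketch.
    """
    rows, current = [], []
    for raw in wikitext.splitlines():
        line = raw.strip()
        if line.startswith("{|") or line.startswith("|+"):
            continue  # table start or caption: no cell data
        if line.startswith("|-") or line.startswith("|}"):
            # Row separator or table end: close the row in progress.
            if current:
                rows.append(current)
            current = []
        elif line.startswith("!"):
            current.extend(c.strip() for c in line[1:].split("!!"))
        elif line.startswith("|"):
            current.extend(c.strip() for c in line[1:].split("||"))
    if current:
        rows.append(current)
    return rows


def rows_to_csv(rows):
    """Serialize rows to a CSV string."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Placeholder table, not vetted statistics.
sample = """{| class="wikitable"
! Year !! Team !! HR
|-
| 1901 || AAA || 12
|-
| 1902 || BBB || 7
|}"""

table_rows = wikitable_to_rows(sample)
csv_text = rows_to_csv(table_rows)
print(csv_text)
```

A vetting bot could use the same kind of parsing to compare a page's table with the database copy and restore any rows that differ, though that step is not shown here.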

Biographical data

We propose that we focus first on articles on people. The handling of biographical data is extremely fragmented in SABR at this writing. Basic demographic data for Major League players is the province of the Biographical Committee, and Bill Carle maintains a simple Access database with this information. BioProject offers fuller-length biographical sketches, not only of some Major League players but also of other people (and some ballparks as well). The Minor Leagues Database has the largest universe of people of any SABR data store, numbering around 170,000. As such, it has become the de facto repository for much biographical information. This has been something of a stumbling block to the database project, as the time spent working on biographical data may be taking focus away from the enormous task of compiling and validating performance data. In addition, even the Minor Leagues Database does not contain everyone who has achieved a sufficient standard of notability; for instance, executives, scouts, major college coaches, and so forth don't fully fall within the Minor Leagues Database persons table, though many people in those categories did play professionally.

Other authoritative content

In addition to authoritative stats, the plan allows for other authoritative information. Research committees would have the option of having a private wiki that only committee leaders of their choosing could access. Information from these wikis would be included in pages on the main encyclopedia and displayed in a special style to show that it was fully authoritative. To a viewer, there would be no sign that a separate wiki existed, just authoritative sections in pages. There will be an example of what an authoritative include would look like in the demo.
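One way to picture an authoritative include is as a placeholder in the public page that is swapped for committee-supplied text at display time. The sketch below is only an illustration of that idea: the `{{authoritative:KEY}}` marker syntax, the `authoritative` CSS class, and the function name are all assumptions, not features of MediaWiki or of the demo.

```python
import html
import re

# Hypothetical marker syntax for this sketch: {{authoritative:section-key}}
INCLUDE = re.compile(r"\{\{authoritative:([\w-]+)\}\}")


def render_with_includes(page_text, committee_sections):
    """Replace include markers with styled committee content.

    committee_sections maps a section key to text held in a
    committee's private wiki (represented here as a plain dict).
    """
    def replace(match):
        section = committee_sections.get(match.group(1))
        if section is None:
            return match.group(0)  # unknown key: leave the marker as-is
        # Wrap the text so a stylesheet can mark it as authoritative.
        return '<div class="authoritative">' + html.escape(section) + "</div>"

    return INCLUDE.sub(replace, page_text)


page = "Intro.\n{{authoritative:bio-facts}}\n{{authoritative:missing}}"
sections = {"bio-facts": "Born & raised in Example City."}
rendered = render_with_includes(page, sections)
print(rendered)
```

The viewer sees only the styled section in the finished page, which matches the goal that the separate wiki stays invisible.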

Last Updated on Friday, 20 February 2009 09:24