Wednesday, October 24, 2012

federated backpack demystified

What is a federation?
Off the top of my head definition of a federation would be; "A federation is a collection of different and dispersed things that have agreed to (or have been forced to) work together, while also remaining independent within themselves". I also like the definition given from a google search on "federation".
fed·er·a·tion /fedəˈrāSHən/
Noun:
1. A group of states with a central government but independence in internal affairs.
2. An organization or group within which smaller divisions have some degree of internal autonomy.
Ok, so a federation is how things collaborate / work together for a greater good and how a number of things bubble up into a larger collective. Two terms that are have been used within the technology realm are federated search and federated databases. Both these fit well within building an understanding of the federated backpack.

Federated search is a favorite of mine due to a past project where we created federated legal search across eight different content sources. The idea being that when you do a legal search the results should come from different content sources, providing different views or legal positions or sets of information. You really want to get a holistic view when doing legal research and how better than looking into the consolidation of a number of different sources, each which reference a different set on content regarding the same search term. The challenge comes in because, most often, the different content sources don't organize information in the same way. They use different taxonomies, or identify different information elements as important, or the underlying data is stored in different ways. The heavy lifting of federated search is in reconciling these different data (or information) structures so that they can bubble up into a single, well structured, search result. Cool thing was we used open source for the solution. There was a bunch of additions we needed to make, the search engine we used was Lucene SOLR. All good, and the nice thing was it has great features to reconcile the different data structures.

Centralized vs Federated Databases
The federated databases are similar to federated search in that they bring together data from different sources, reconcile the differences and display query results as a single, well structured, result. The accompanying diagram compares centralized database (left-side) with a federated database (right-side). Centralization means that all the data is stored in one database within one model (or a single data structure) for all the data in the database. The federated database transparently maps all the different databases into one single database structure. The mapping will accommodate for any differences in data structure among the different databases. This mapping approach allows autonomy between the databases so they can alter their data structure to better suit their particular needs. Again, the heavy lifting comes in reconciling the data's structural differences and having the technology to provide a single, well structured query result.

Click here is you want a more detailed description of Centralized vs Federated databases.

So there you have it, two good examples of federations within the technology realm. In summary, a federation is the grouping of autonomous sets of data with common elements. When viewing into the federated data it looks to be one set of data where the processes and tools surrounding the federation reconcile the differences of the data sets.

badge is saved in OBI database.
What is the backpack? really.
Important Point 1: Currently the Mozilla digital backpack is a view into your badge data as stored in the Open Badges Infrastructure (OBI). The OBI currently runs on the Mozilla servers that are hosting http://beta.openbadges.org. So when you "store" your badges in your backpack, you are actually storing your badges in the database of the OBI and the backpack is providing the view into that data. Some of the data structures within the OBI are dedicated to the backpack, currently all the data is stored within the OBI database. When it comes to federation the storage location is an important distinction.
Important Point 2: The OBI is open source and can be hosted on pretty much any server that supports the technical requirements of the OBI. This means that many instances of the OBI can exist. Therefore, many backpacks can also exist. This is where the concept of federated backpacks begins.

As the Mozilla Open Badges Infrastructure (OBI) matures and proliferates it will mean badges will be stored in different physical servers and these servers can be distributed through-out the world. People will be storing their badges in these different servers, and be making reference (maybe) to these servers for display of their badges. The idea of the federated backpack allows a consolidated view of all the badges a person has earned, regardless of where the badges are stored.

Why is the federated backpack the holy grail of open badges?
Back in May of 2012 Chris McAvoy (Product Lead for Mozilla Open Badges Project) wrote a post that said that the federated backpack was the holy grail of the open badges infrastructure. So what is meant by the federated backpack being the holy grail of the open badges project. I'd mostly say cause it is the milestone where they could consider themselves finished (well not really). I believe the day when Mozilla could sever its responsibility from hosting an OBI instance is the beginning of when the OBI could truly be released into the public domain.This is also the day when other instances of the OBI are up and running and all these instances openly exchange information about the badges they contain. They become a federation of Open Badges Infrastructure (OBI) instances; or in other words, the federated backpack. This would also mean that Mozilla would no longer have to host an OBI instance and could focus more deeply on making the OBI code base rock solid and continue advocating for an open metadata standard for the digital badge.