Teaching retrieval of information: What do you leave out?

Hello, academia! One of Naaman’s new responsibilities is, of course, teaching. In the long run, the teaching load will comprise courses driven by my own research interests (a Social Media class? Mobile Information?), as well as core courses in information science. I had tons of fun teaching Research Methods to a great group of MLIS students last semester. This spring, I will be teaching Retrieving And Evaluating Electronic Information to undergrads:

“In this course, students examine and analyze the information retrieval process in order to more effectively conduct electronic searches, assess search results, and use information for informed decision making. Major topics include search engine technology, human information behavior, evaluation of information quality, and economic and cultural factors that affect the availability and reliability of electronic information.”

Now there’s a topic that could launch a thousand PhD theses… how do I pack it into one semester? As I see it, the class should be a combination of “how to” and “how it works”: both understanding how the technologies work and how to use them best (the two are, of course, interrelated).

For now, I have the class set up in the following way (with thanks to Marie, Nina and Nick who taught this class before me):

First, I will spend time discussing the basics of how to search, starting from the very basics of choosing and iterating on keywords, through Boolean operators and advanced search functions. Then, I will spend a few sessions talking about search technology, or how search engines work (you know, crawling/indexing/ranking); a sketch of these ideas follows below. I believe that everyone should have an understanding of how search works in order to recognize the biases and limitations inherent in the process. In the middle, I will discuss the presentation of search results as well as the topic of browsers.

All this will take me 8-9 sessions (1 hour and 15 minutes each) out of a total of about 25.
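To give a flavor of the “how it works” material, here is a minimal sketch (my own illustration, not anything from the syllabus) of the indexing and Boolean-operator ideas: a toy inverted index over a few “crawled” documents. The documents and query terms are made up.

    from collections import defaultdict

    # A few "crawled" documents (made-up toy data).
    docs = {
        "d1": "search engines crawl and index the web",
        "d2": "boolean operators refine a web search",
        "d3": "ranking orders the search results",
    }

    # Indexing: map each term to the set of documents containing it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)

    def boolean_and(*terms):
        """Return the documents that contain every query term."""
        sets = [index.get(t, set()) for t in terms]
        return set.intersection(*sets) if sets else set()

    print(boolean_and("web", "search"))  # {'d1', 'd2'}

Ranking, the third piece of the pipeline, would then order the matching documents, for instance by term frequency or by link analysis.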

Then, beyond generic search, there are other sources for retrieving information, even on the web: directories, reference sites (e.g., dictionaries, but many more), and business databases. Of course, specialized databases, like academic libraries and other digital libraries, play a major role in this world, especially for university undergrads.

This concludes the very basic “what you need to know” about retrieving information, and about half the class sessions. But we’re only getting warmed up. Here is a dump of additional topics I am planning to cover: news, breaking information, and tracking topics (alerts and feeds; see the sketch below); web reference tools (from Wikipedia to Yahoo Answers), which of course leads to the topic of information reliability; publishing information on the Web; economics of information (here’s another topic that could last a few semesters); legal aspects of information use (e.g., copyright issues, Creative Commons); bookmarking and knowledge collections; social media and blogs; and multimedia search (of course).
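On the alerts-and-feeds point, here is a minimal sketch (again my own illustration; the feed URL and keyword are placeholders) of tracking a topic by polling an RSS feed and flagging items that mention a keyword:

    import urllib.request
    import xml.etree.ElementTree as ET

    FEED_URL = "https://example.com/rss"  # placeholder; any RSS 2.0 feed works
    KEYWORD = "information retrieval"

    # Fetch and parse the feed.
    with urllib.request.urlopen(FEED_URL) as resp:
        root = ET.fromstring(resp.read())

    # RSS 2.0 lists entries under channel/item, each with a title and link.
    for item in root.iterfind("./channel/item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        if KEYWORD.lower() in title.lower():
            print(f"match: {title} ({link})")

A real alert service would run something like this on a schedule and remember which items it has already seen.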

This, together with student presentations and exams, will pretty much conclude the class. But there’s so much else one could cover… here’s what I left out for now: ethical and cultural aspects of information; information overload; mobile information retrieval; the semantic web (OK, “semantics on the web” may be a better title); personal information management (e.g., Stuff I’ve Seen); non-text retrieval (e.g., location-driven information); the hidden web; the Web of Data; phew!

Now, our undergraduate program at SCILS offers classes that touch in depth on many of these issues. But can I really leave these topics out of any basic retrieval-of-information class? Doesn’t everyone need to know about these? Is there anything else I left out that must be covered?

Whatever form the class takes, I am excited. Mostly, I am curious to see what undergrads these days know about search, and how their perceptions can change in 14 short weeks.

7 thoughts on “Teaching retrieval of information: What do you leave out?”

  1. Ryan Shaw

    If I were teaching such a class, I would start by trying to frame the whole endeavor of IR in broad terms. It’s so easy to lose track of the big picture once you start delving into the specifics of IR. In particular, I would have them read excerpts from Patrick Wilson’s Two Kinds of Power, which distinguishes between the power to retrieve documents matching a description and the power to retrieve documents that fit a purpose. The former is amenable to automation, but the latter is not. The heart of IR is the struggle to make the former serve the latter, and no one has expressed it better than Wilson.

  2. Ofer Egozi

    Looking forward to reading about insights gathered in the process…

    To take Ryan’s idea further, I’d also distinguish IR from Data Retrieval, using Keith van Rijsbergen’s classic introduction.

    Additionally, I think one of the fascinating academic aspects of IR in the web era is that of large-scale evaluation.
    Similar to talking about Wikipedia, which demonstrates the, say, “explicit” wisdom of the crowds, you could talk about web search evaluation (and website analytics in general) as a task that spawned a whole set of sub-fields gleaning information from huge amounts of query logs: result quality, query rewrites (synonyms, spellcheck…), advertising bids, and so on.
    It also boosted the concept of A/B testing as a viable and successful scientific methodology, and you probably remember Ben Shneiderman’s take on that 🙂
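
    A minimal sketch of the A/B-testing idea above, with made-up counts: compare the click-through rates of two result rankings using a two-proportion z-test.

        from math import sqrt, erf

        # Made-up clicks and impressions for the two rankings being tested.
        clicks_a, views_a = 420, 10_000
        clicks_b, views_b = 465, 10_000

        p_a, p_b = clicks_a / views_a, clicks_b / views_b
        p_pool = (clicks_a + clicks_b) / (views_a + views_b)
        se = sqrt(p_pool * (1 - p_pool) * (1 / views_a + 1 / views_b))
        z = (p_b - p_a) / se
        # Two-sided p-value via the normal CDF, Phi(x) = 0.5 * (1 + erf(x / sqrt(2))).
        p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

        print(f"CTR A={p_a:.3f}, CTR B={p_b:.3f}, z={z:.2f}, p={p_value:.3f}")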

  3. naaman Post author

    Good pointer, thanks – I’ll check out van Rijsbergen. I don’t think I’ll be getting into evaluation, though; it is just an undergraduate class, and that topic is definitely covered in the graduate IR class at SCILS.

