Quotes on Personal Information Management
- date: 2018-05-27
These quotes are taken from William Jones and Jaime Teevan’s Personal Information Management (2007).
Here’s a (large, outdated) mind-map
Here are the quotes I found useful.
PIM is both the practice and the study of the activities people perform to acquire, organize, maintain, retrieve, use, and control the distribution of information items such as documents (paper-based and digital), Web pages, and email messages for everyday use to complete tasks (work-related and not) and to fulfill a person’s various roles (as parent, employee, friend, member of the community, etc.).
In rough equivalence to the input-storage-output breakdown of actions associated with a PSI, essential PIM activities can then be grouped as follows:
- Finding/re-finding activities move from need to information. These activities affect the output of information from a PSI.
- Keeping activities move from information to need. These activities affect the input of information into a PSI.
- Meta-level activities focus on the PSI itself and on the management and organization of the PICs within. Efforts to “get organized” in a physical office, for example, are one kind of meta-level activity.
Finally, there is sometimes discussion of personal knowledge management (PKM). Given the usual ordering of data < information < knowledge, one is tempted to think that PKM is more important than PIM and, ultimately, this may be so. One major challenge of PKM, however, just as with knowledge management more generally, is the articulation of rules and “lessons of a lifetime” in a form that we (and possibly others) can understand. Knowledge expressed and written down becomes one or more items of information, to be managed like other information items.
Another reason orienteering may be preferred is that it can provide both an overview of the information space being searched as well as context about where the desired information is located in that space.
recognizing rather than recalling
Malone (1983) noted that spatial location helps support the finding of physical information. Similarly, the location of a piece of digital information is a particularly important type of context used when orienteering for personal information (Teevan et al. 2004).
Information keeping (or just keeping). Decision making and actions relating to the information item currently under consideration that impact the likelihood that the item will be found again later. Decisions can range from: (1) “ignore, this has no relevance to me”; (2) “ignore, I can get back to this later” (by asking a friend, searching the Web, or some other act of finding) to purposive seeking of information as a consequence of a need to satisfy some goal; and/or (3) “keep this in a special place or way so that I can be sure to use this information later”.
Information organizing (or just organizing). Decision making and actions relating to the selection and implementation of a scheme of organization and representation for a collection of information items. Decisions can include: (1) How should items in this collection be named? (2) What set of properties make sense for and help to distinguish the items in this collection? (3) How should items within this collection be grouped? Into piles or folders?
Information maintaining (or just maintaining). All decisions and actions relating to the composition and preservation of a personal information collection. Decisions involve what kind of new items go into a collection, how information in the collection is stored (Where? In what formats? In what kind of storage? Backed up how?), and when do older items leave the collection (e.g., when are they deleted or archived?).
Keeping decisions often occur in a gray area where determination of costs, reciprocal benefits and outcome likelihoods is not straightforward (Jones 2004). In the logic of signal detection, this middle area presents people with a “damned if you do; damned if you don’t” choice.
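A minimal sketch, not from the book, of the signal-detection framing of a keeping decision. The four outcome labels are the standard signal-detection cells. The function names, the probability, and the cost values are illustrative assumptions.

```python
from enum import Enum


class Outcome(Enum):
    HIT = "kept and later needed"
    FALSE_ALARM = "kept but never needed"
    MISS = "not kept but later needed"
    CORRECT_REJECTION = "not kept and never needed"


def classify(kept: bool, needed_later: bool) -> Outcome:
    """Map one keeping decision and its eventual outcome to a signal-detection cell."""
    if kept:
        return Outcome.HIT if needed_later else Outcome.FALSE_ALARM
    return Outcome.MISS if needed_later else Outcome.CORRECT_REJECTION


def expected_cost(kept: bool, p_needed: float,
                  cost_false_alarm: float, cost_miss: float) -> float:
    """Expected cost of a decision; hits and correct rejections are treated as free (an assumption)."""
    if kept:
        return (1.0 - p_needed) * cost_false_alarm
    return p_needed * cost_miss


# Hypothetical gray-area item: equally likely to be needed or not, equal costs.
cost_if_kept = expected_cost(True, p_needed=0.5, cost_false_alarm=1.0, cost_miss=1.0)
cost_if_ignored = expected_cost(False, p_needed=0.5, cost_false_alarm=1.0, cost_miss=1.0)
```

With these made-up numbers both choices carry the same expected cost, which is one way to read the “damned if you do; damned if you don’t” middle area.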
Both options, filing and piling, have their advantages and disadvantages. Filing information items (whether paper documents, e-documents, or email messages) correctly into the right [directories] is a cognitively difficult and error-prone activity. Difficulty arises in part because the definition or purpose of a [directory] is often unclear from the label (e.g., “stuff”), and then may change in significant ways over time. Determining a [directory’s] definition may be at least as problematic as determining a category’s definition. Worse, people may not even recall the [directories] they have created and so create new [directories] to meet the same or similar purposes.
The fragmentation of information across forms also poses problems in the study of PIM. There is a natural tendency to focus on one form of information in order to manage the scope of inquiry: to study only email, for example, or the use of Web bookmarks, or the organization of paper documents. But a focus by the information form can have the effect of endorsing current application-centric partitions of information and the information fragmentation that results from these partitions, certainly one of the most vexing problems in PIM today.
People are hampered in their efforts to organize information by a lack of adequate system support for basic features (such as ordering or the reuse of structure) and by a lack of support for the integrative use of a single organization for different forms of information (e.g., e-documents, email, Web references, informal notes).
How should a person’s information be organized? Or can people effectively keep information for later use and repeated reuse without bothering much about information organization? For some people in some situations, the answer may be that organization is not that important.
But for other people, in other situations, their own intelligent efforts to structure and organize information are critical to the creation of supporting external representations of a project or problem.
For example, the right diagram can allow one to make inferences more quickly (Larkin & Simon 1987). The way information is externally represented can produce huge differences in a person’s ability to use this information in short-duration, problem-solving exercises (Kotovsky, Hayes, & Simon 1985). Different kinds of representations, like matrices and hierarchies, are useful in different types of problems (Cheng 2002; Novick 1990; Novick, Hurley & Frances 1999).
To support the researcher’s role as primary data collection instrument, Chatman (1992) advocates that researchers maintain three types of notebooks: (1) a field diary for recording observational notes, the “simple reporting of phenomena”, which is used, as part of triangulation, for testing criterion validity; (2) method notes, which consist of “strategies employed or that might be employed to obtain data”, and thus record observations and ideas about the usefulness of certain methodologies, their effects, and how they may be changed for future research; and (3) a theory notebook for “testing construct validity and the generation of propositional statements to explain phenomena”, which is used as part of theory building.
The analytic technique of “memoing” begins at the outset of data collection through recording certain observations and ideas in the theory notebook. Later, as data analysis progresses, these theory notes are rewritten in the form of extensive memos that connect the researcher’s thoughts on different phenomena and are later used as part of theory building, for which emphasis is placed on identifying negative cases or anomalies that refute any theoretical framework established for the investigation. Miles and Huberman (1992, p. 72) cite Glaser and Strauss’s (1967) definition of memoing, which is a classic: “the theorizing write-up of ideas about codes and their relationships as they strike the analyst while coding … it can be a sentence, a paragraph, or a few pages … it exhausts the analyst’s momentary ideation based on data with perhaps a little conceptual elaboration”.
Ontology. Attempt to formulate an exhaustive and rigorous conceptual schema within a given domain. Often an ontology is a hierarchical data structure containing all the relevant entities and their relationships and rules (theorems, regulations) within that domain (Wikipedia 2006).
Taxonomies. Hierarchical structures for classifying a set of objects. They are less expressive than ontologies as a means for expressing structure on objects in the world. They only allow subclass relationships, and cannot represent relationships between concepts.
Thesaurus. A data structure designed for indexing, where we associate with every important term in the domain a set of terms related to it.
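The three definitions above can be contrasted with a small sketch. This is not from the book: the terms, relation names, and helper functions are all made up for illustration, and a real ontology would also carry rules, which this toy version leaves out.

```python
# Taxonomy: only subclass ("is-a") links, stored as child -> parent.
taxonomy = {
    "email message": "digital document",
    "web page": "digital document",
    "digital document": "information item",
    "paper document": "information item",
}


def ancestors(term: str) -> list[str]:
    """Walk the is-a chain from a term up to the root of the taxonomy."""
    chain = []
    while term in taxonomy:
        term = taxonomy[term]
        chain.append(term)
    return chain


# Thesaurus: every important term maps to a set of related terms, used for indexing.
thesaurus = {
    "email message": {"mail", "inbox", "correspondence"},
    "folder": {"directory", "category"},
}

# Ontology (toy version): typed relationships stored as (subject, relation, object).
ontology = {
    ("email message", "is_a", "digital document"),
    ("email message", "sent_by", "person"),
    ("digital document", "stored_in", "folder"),
}


def relations_of(subject: str) -> set[tuple[str, str]]:
    """Return every (relation, object) pair recorded for a subject."""
    return {(rel, obj) for subj, rel, obj in ontology if subj == subject}
```

The taxonomy can only answer “what kind of thing is this?”, while the triple store can also record relations such as `sent_by` that are not subclass links.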
Among the many and diverse issues existing in [digital libraries] research, ranging from intellectual rights management to user collaboration to architectural interoperability, the most relevant from a PIM point of view are issues related to the management of metadata (and more generally with the so-called semantic interoperability) and the various types of multimedia data.
Metadata treats information as opaque, but offers a standard model for talking about those objects in assorted ways. Examples include grouping (as in file directories), annotating (as in ID3 tags for media and del.icio.us for Web pages), and linking (as in the World Wide Web)… much information management relies only on object metadata, [and] so can be supported over data objects of all types.
While a metadata scheme can ignore the complex formats of the objects it is talking about, there must still be a common API for talking about those objects. In particular, any metadata scheme needs some standard way to indicate which object is being talked about. This may be a standard scheme for naming objects (e.g., file names or URLs) so that their names can be used to [reference] them, or it might be accomplished by embedding the metadata with the object being annotated (as with ID3 tags in media files).
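A minimal sketch, not from the book, of a metadata scheme that treats objects as opaque and refers to them only by name (a file path or URL). The `MetadataStore` class, its method names, and the example paths and URLs are hypothetical.

```python
from collections import defaultdict


class MetadataStore:
    """Metadata about opaque objects, keyed by a name such as a file path or URL."""

    def __init__(self) -> None:
        self.groups = defaultdict(set)  # group name -> object names
        self.tags = defaultdict(set)    # object name -> annotations
        self.links = defaultdict(set)   # object name -> names it links to

    def group(self, group_name: str, obj: str) -> None:
        self.groups[group_name].add(obj)

    def annotate(self, obj: str, tag: str) -> None:
        self.tags[obj].add(tag)

    def link(self, source: str, target: str) -> None:
        self.links[source].add(target)

    def describe(self, obj: str) -> dict:
        """Collect everything the store knows about one object name."""
        return {
            "groups": sorted(g for g, members in self.groups.items() if obj in members),
            "tags": sorted(self.tags.get(obj, set())),
            "links": sorted(self.links.get(obj, set())),
        }


store = MetadataStore()
store.group("trip-planning", "file:///home/me/itinerary.pdf")
store.group("trip-planning", "https://example.com/hotel")
store.annotate("https://example.com/hotel", "to-book")
store.link("file:///home/me/itinerary.pdf", "https://example.com/hotel")
```

Because the store never opens the objects, a PDF and a Web page can share one group. This is the external-naming option; the embedded alternative (as with ID3 tags) would keep the metadata inside each file instead.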
In defense of “flat hierarchies”.
As Malone and others have pointed out, a physical filing structure is a hierarchical category structure (Jones, Phuwanartnurak, Gill & Bruce 2005; Malone 1983). Putting an object into a tree structure is a process that is exquisitely sensitive to choices made while descending the hierarchy. An error made in choosing the category can result in an object being very far from its “correct” file location.
Dabbish et al. present a more systematic analysis of message types. Messages could serve multiple functions, but the overall breakdown was Action requests (36%), Information requests (18%), Information attachments (36%), Status updates for projects/tasks (21%), Scheduling requests/responses to scheduling requests (14%), Reminders for meetings/actions (16%), Social messages (8%), and Other (12%).
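A small sketch, not from the book, of why an error high in the hierarchy is costly. It measures how many folder-to-folder steps separate a misfiled item from its correct location. The folder names and the `tree_distance` helper are made up for illustration.

```python
def tree_distance(path_a: tuple[str, ...], path_b: tuple[str, ...]) -> int:
    """Number of edges between two folders, each given as a path from the root."""
    shared = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        shared += 1
    # Climb from one folder to the deepest shared ancestor, then descend to the other.
    return (len(path_a) - shared) + (len(path_b) - shared)


correct = ("work", "projects", "pim-study", "interviews")

# Wrong choice at the last level: the item lands in a sibling folder.
late_error = ("work", "projects", "pim-study", "surveys")

# Wrong choice at the first level: the item lands in a different subtree.
early_error = ("personal", "projects", "pim-study", "interviews")

late_distance = tree_distance(correct, late_error)
early_distance = tree_distance(correct, early_error)
```

In this toy tree the late mistake is 2 steps from the correct folder and the early mistake is 8, even though both involve a single wrong choice. A flatter hierarchy puts an upper limit on that worst case.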
Finally, there have been video studies of how people allocate time processing email (Bellotti et al. 2005):
- 54% composing messages
- 23% reading messages, attachments, and links
- 10% filing messages
- 6% scanning inbox messages for things to read or deal with
- 2% deleting messages
- 2% looking for messages in folders
- 2% managing attachments