Searching with Precision

This workshop is designed to help you perform keyword searches of library catalogs and electronic databases with efficiency. While most students have some familiarity with keyword searching through the Google search engine, library catalogs and databases don’t work in the same way. To query databases effectively and efficiently, you need to be familiar with Boolean searches.

Keyword searches often produce too many results to examine in a reasonable amount of time, and many of those results are irrelevant. Rather than page through hundreds or thousands of results, you can use Boolean searches to make your queries much more precise. Boolean searches use logical operators to form search strings that help you zero in on the content you seek.

The three most common logical operators are AND, OR, and NOT. However, you should also learn how to use exact phrases, truncation, and wildcards in search strings.


Workshop

Study the following information, then complete the problem sets.

Operator | Purpose | Example
AND | Narrow a search by adding additional keywords | Tom Hardy AND shirtless
OR | Broaden a search by adding additional keywords | Ohio OR Virginia
NOT | Prune search results by removing certain keywords | Vikings NOT football
"..." | Return results containing an exact phrase | “artificial intelligence”
* | Truncation: used to capture all the word endings from a search term | manufact*
? | Wildcard: finds spelling variations | organi?ation, wom?n

Boolean search examples

These Boolean terms are described in more detail below:

AND

Although it may seem counterintuitive, AND is used to narrow the number of sources you retrieve from a database. You can visualize the search of a large academic database or library catalog using the following diagram:

{ linguistics AND cognitive AND childhood }

In the search depicted above, a student has requested articles that contain the subjects cognitive, linguistics, and childhood. This particular search will only retrieve articles that contain all three terms. This small subset of the larger subject sets is indicated by the arrow. All the information represented by the other portions of the three circles will be excluded. Thus, even if an article contains only two of the three search terms, it will be excluded from the results.

OR

Unlike the AND operator, OR seeks to broaden a search, as in this example:

{ Virginia OR Ohio }

In the search depicted above, a student has searched for the subjects Virginia OR Ohio. This search will return every article having the subject of Virginia as well as every article with the subject of Ohio. Unlike the AND search, where only articles containing both terms are returned in the search results, the OR search yields every source on both subjects regardless of whether those subjects appear together in the same source. As a consequence, the OR search will produce far more results.

  • Since the OR operator lacks precision, it is most often used in parenthetical searches, described below.

NOT

The Boolean operator NOT is used to subtract or screen out topics or keywords that are unwanted within the search results.

{ “alternative energy” NOT solar }

In the search depicted above, a student is researching alternative energy and wants to exclude any information dealing with solar energy. To remove all references to solar energy, the student has searched for “alternative energy,” but has removed any articles from the search results that contain the subject solar using the operator NOT.

The NOT operator is helpful when you find your search results are polluted with unwanted items. This is often a problem when two distinct things share the same name. For example, if you were researching the Norse explorers known as the Vikings, you might discover that your search results include unwanted information about the Minnesota Vikings football team. You can subtract these unwanted results by searching for Vikings NOT Minnesota or Vikings NOT football, for example.

Note that the “alternative energy” NOT solar search above uses quotation marks to form an exact phrase search, described below.
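
The three operators behave like operations on sets of records. The following is a minimal sketch in Python, not a description of how any particular catalog is implemented. The `index` dictionary and its record numbers are made up for illustration.

```python
# Toy index (hypothetical data): each keyword maps to the set of
# record IDs that contain it.
index = {
    "linguistics": {1, 2, 3, 4},
    "cognitive": {2, 3, 4, 5},
    "childhood": {3, 4, 6},
    "virginia": {7, 8},
    "ohio": {8, 9},
    "vikings": {10, 11, 12},
    "football": {11, 12, 13},
}

# AND narrows: keep only records that contain every term (set intersection).
and_hits = index["linguistics"] & index["cognitive"] & index["childhood"]

# OR broadens: keep records that contain either term (set union).
or_hits = index["virginia"] | index["ohio"]

# NOT prunes: drop records that contain the unwanted term (set difference).
not_hits = index["vikings"] - index["football"]

print(sorted(and_hits))  # [3, 4]
print(sorted(or_hits))   # [7, 8, 9]
print(sorted(not_hits))  # [10]
```

Record 2 contains two of the three AND terms (linguistics and cognitive) but is still excluded, which mirrors the behavior described in the AND section.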

Advanced searching

These Boolean operators can be used to create long, increasingly precise search strings when they are fortified with more advanced operators: parentheticals, exact phrases, truncation, and wildcards.

Parenthetical searches

You can also use the various Boolean search terms in tandem using parenthetical constructions:

  • (Ohio OR Virginia) AND unemployment

  • (cognitive AND linguistics) NOT childhood

Such parenthetical searches follow an order of operations, as in math equations: the expression inside the parentheses is evaluated first. In the first example, the search first combines all the articles with the keyword Ohio with all the articles with the keyword Virginia, creating a large collection of search results. The keyword unemployment is then applied to that collection using the AND operator, yielding final results that deal with unemployment in either Virginia or Ohio. Similarly, the second example creates a collection of results that share the subjects cognitive and linguistics, then removes all the items containing the term childhood.
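
To see why the placement of parentheses matters, here is a small Python sketch that again treats each keyword as a set of record IDs. The sets are invented for illustration, and a real catalog may evaluate ungrouped operators in its own order.

```python
# Hypothetical record IDs for each keyword.
ohio = {1, 2, 3}
virginia = {3, 4, 5}
unemployment = {2, 4, 6, 7}

# (Ohio OR Virginia) AND unemployment
grouped = (ohio | virginia) & unemployment

# Ohio OR (Virginia AND unemployment): same terms, different grouping.
regrouped = ohio | (virginia & unemployment)

print(sorted(grouped))    # [2, 4]
print(sorted(regrouped))  # [1, 2, 3, 4]

# (cognitive AND linguistics) NOT childhood
cognitive = {10, 11, 12}
linguistics = {11, 12, 13}
childhood = {12, 14}
pruned = (cognitive & linguistics) - childhood

print(sorted(pruned))     # [11]
```

The first grouping returns only unemployment records from either state. The second returns every Ohio record whether or not it mentions unemployment.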

Exact phrase searches

Most Internet search engines and library catalogs default to the AND operator when multiple terms are entered, even if it has not been typed by the user. For example, if you search for artificial intelligence, the search algorithm will actually use the search string artificial AND intelligence to produce your results. In some circumstances this may produce undesirable results. For example, we might imagine an article about the “intelligence” of using certain “artificial” sweeteners in food for children. This is not an article that is relevant for your project.

To avoid this problem, you can instruct your search engine to perform what is known as an exact phrase search. This is performed by placing quotation marks around the exact words you are searching for:

  • “artificial intelligence” AND apocalypse

By searching for “artificial intelligence” your search results will only contain items that have that exact phrase within the document or title.
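
The difference between an implicit AND and an exact phrase can be sketched in Python as follows. The two document titles and the helper functions `matches_all` and `matches_phrase` are hypothetical; real search engines use more elaborate matching.

```python
import re

# Two made-up titles: one about AI, one like the sweetener example above.
docs = {
    "A": "Artificial intelligence and the end of the world",
    "B": "On the intelligence of feeding artificial sweeteners to children",
}

def words(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z]+", text.lower())

def matches_all(text, terms):
    """Implicit AND: every term appears somewhere in the text."""
    present = set(words(text))
    return all(term in present for term in terms)

def matches_phrase(text, phrase):
    """Exact phrase: the words appear next to each other, in order."""
    tokens = words(text)
    target = phrase.lower().split()
    n = len(target)
    return any(tokens[i:i + n] == target for i in range(len(tokens) - n + 1))

and_hits = [k for k, t in docs.items()
            if matches_all(t, ["artificial", "intelligence"])]
phrase_hits = [k for k, t in docs.items()
               if matches_phrase(t, "artificial intelligence")]

print(and_hits)     # ['A', 'B']
print(phrase_hits)  # ['A']
```

Both titles satisfy artificial AND intelligence, but only the first contains the two words side by side.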

Truncation and wildcards

  • manufact* (truncation)

  • wom?n (wildcard)

If you search for the terms steel AND manufacturing, your search results may not include results with the terms manufacturer, manufacture, manufactured, or manufactures. As a result, you may not discover articles or books that are important to your research. By truncating the word with an asterisk, however, you will gather all the relevant search results.

Similarly, if you only search for woman, you will potentially miss all the texts that mention women or womyn. Or imagine that you are doing research on a certain type of organization. It would be wise to search for organi?ation, since much of the English-speaking world spells the word with an “s” instead of a “z” as we do here in the US. Without the wildcard, your research may become skewed toward the literature on US-based institutions. Using the wildcard ?, however, you will search all spellings simultaneously and gather all the relevant results. The question mark wildcard should be used to replace a single letter only.
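
One way to picture truncation and wildcards is as simple patterns matched against a list of words. The Python sketch below assumes that * stands for any run of characters at that position and that ? stands for exactly one character, as described above. Individual databases may define these symbols differently, so check the help pages of the database you are using. The helper names and the word list are made up for illustration.

```python
import re

def pattern_to_regex(term):
    """Translate a search term containing * or ? into a regular expression."""
    parts = []
    for ch in term:
        if ch == "*":
            parts.append(r"\w*")  # truncation: any run of word characters, possibly none
        elif ch == "?":
            parts.append(r"\w")   # wildcard: exactly one word character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def matching_words(term, vocabulary):
    """Return the words in vocabulary that the whole pattern matches."""
    regex = pattern_to_regex(term)
    return [w for w in vocabulary if regex.fullmatch(w)]

vocabulary = ["manufacture", "manufacturer", "manufacturing", "manual",
              "woman", "women", "womyn", "organization", "organisation"]

print(matching_words("manufact*", vocabulary))
# ['manufacture', 'manufacturer', 'manufacturing']
print(matching_words("wom?n", vocabulary))
# ['woman', 'women', 'womyn']
print(matching_words("organi?ation", vocabulary))
# ['organization', 'organisation']
```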

To illustrate, look at these two searches on the JSTOR database:

Problem Set

  1. You want to examine fictional portrayals of Arab-Israeli conflict. Create a Boolean search string to search the library catalog with.

  2. Narrow the previous search by examining only fictional portrayals involving Israelis and Palestinians available in our library.

Answer

This research question seems to arise from someone who would like to get a general look at the universe of texts that have a particular thematic focus. We are not looking for a particular book; we want to find all of the works of fiction that deal with this historical conflict. This gets me thinking about using controlled vocabulary terms provided by subject headings.

However, I didn’t know what the proper subject headings were, so I started off with keyword searches using terms like these below, trying to capture some sources that I could use to isolate some subject headings:

Israel* AND Arab AND Fiction

Israel* AND Palestin* AND Fiction

Israel* AND Arab AND Motion Pictures

These keyword searches led to many good hits from which I extracted the following subject headings:

  • Jewish-Arab relations
  • Fiction
  • Motion Pictures
  • Palestinian Arabs
  • Arab-Israeli conflict
  • Arab-Israeli conflict -- Literature and the conflict

Then I performed some subject searches like these:

1. Arab-Israeli conflict AND Fiction

2. Arab-Israeli conflict AND Fiction AND Palestinian Arabs

These subject searches in the catalog allowed me to find all the books that we have on these subjects. This type of searching is very useful for getting the 20,000-foot view of a topic. At some point, however, you must narrow your focus. So you will likely have to do some additional searches once your interests become more precise.

And since these subject searches are only going to return books (and some physical media), you should also think about proper keyword searches for academic databases, which you must search separately. Since these searches are about fiction, we should find databases that specialize in fiction: the MLA database, for example.

  3. Find peer-reviewed sources that examine Herman Melville’s novel Moby Dick and the Cold War that were written between 1990 and 2000.

Answer

Here I used two exact phrase searches in a keyword search and then used the library’s search limiters to restrict the results to peer-reviewed publications from 1990 to 2000.

"Moby Dick" AND "Cold War"

  4. You are curious if there is any scholarship on fictional portrayals of pandemic disease.

Answer

This problem is also one that we should tackle through the use of controlled vocabulary terms. First I began with some general keywords that I associated with this idea: pandemic AND fiction. After finding a few relevant hits I was able to discover subject headings that allowed me to dive deeply into what is available on the subject. Significantly, I learned that the controlled vocabulary term I should probably use is “epidemic” and not “pandemic” as I had first thought:

  • Plague in literature
  • Fiction -- History and criticism
  • Fiction
  • Epidemics in literature
  • Diseases and literature
  • Criticism, interpretation, etc

  5. You want to know if certain governments are using the COVID-19 pandemic as an excuse to spy on their citizens.

Answer

This problem is more challenging for several reasons: 1) the current pandemic is a very recent event, which makes finding peer-reviewed sources less likely; 2) the keyword “spy” is not very productive since the controlled vocabulary term preferred by libraries is “surveillance”; and 3) the word “surveillance” is also a term of art that is used to describe the legitimate efforts by epidemiologists and government health organizations to understand the spread of disease, thus polluting our results with sources that we don’t actually want.

To overcome these issues, we may have to rely on journalism outlets we respect and any government publications or open-source intelligence we can find. We may also need to add keywords such as rights, privacy, libert*, or freedom* alongside the keyword surveillance to try to find sources that match our intentions.

I started with a Google search using the following search string, which produced several good leads:

government* AND covid* AND surveillance AND privacy

  • A number of these results cite other related texts in their bibliographies, so a citation chase will be helpful to locate other sources.

  • We might further augment this search string by using the names of particular countries and/or intelligence services of interest.

I reproduced this search string in our library and found several peer-reviewed articles:

government* AND covid* AND surveillance AND privacy

  6. You are interested in female MMA fighters and want to know if there is any scholarship on this.

Answer

This search gave us an opportunity to use a wildcard. By using “Wom?n” we can ensure that we get sources that reference both “women” and “woman”. And the exact phrase search helps us find only articles that mention the words “mixed martial arts” in that exact order.

Wom?n AND “mixed martial arts”
Wom?n AND “mixed martial arts” AND gender