Karen Spärck Jones: Unravelling natural language

Originally published in the ebook A Passion for Science: Stories of Discovery and Invention.

by Bill Thompson

The renowned computer scientist Karen Spärck Jones died in 2007, aged only seventy-one. Her husband Roger Needham, another computer scientist whom she’d married in 1958, had died of cancer in 2003 shortly after his sixty-eighth birthday. I wrote her obituary for The Times, as I’d written Roger’s four years earlier. I’d written an obituary for their colleague David Wheeler in 2004, and already had Maurice Wilkes’ on file, though it wasn’t needed until 2010 as he lived to be ninety-seven.

Although writing obituaries was never a full-time occupation, as a technology journalist with a computing degree I was regularly commissioned by The Times to cover well-known figures in the computing industry or computer science, and these four clearly merited coverage in “the paper of record”. After all, Spärck Jones, Needham, Wheeler and Wilkes had been key members of the generation that created modern computing and shaped the world we live in today, and it was important to reflect on their careers: without their achievements in the Cambridge University Computer Laboratory I think it unlikely that the world of computing would have the shape it does today.

I also wanted to mark their passing because the four of them were also my teachers. I’d studied for the Diploma in Computer Science at Cambridge, and they had all been teaching or working in the Lab. Writing an obituary of someone you know is very different from pulling someone’s life together from a quick clippings job and a few short chats with family and colleagues. I had known all of them, spent time in their presence, had to defend my views to them — with more or less success. I’d watched them as they lectured on their own work, and had to face the occupational hazard of writing an essay on a topic knowing that it would be assessed by the person who had actually made the theoretical breakthrough you were discussing.

I completed my one-year Diploma in 1984, and Karen was one of the few senior women in the Lab at the time. I discovered later that she didn’t have a full-time position for many years, and relied on short-term research contracts to fund her work. Despite this, Karen’s academic career was impressive: she published nine books and over two hundred substantial papers; she served as president of the Association for Computational Linguistics in 1994; and she was elected a Fellow of the British Academy in 1995.

Karen was a research fellow at Newnham College from 1965 to 1968, a Fellow of Darwin College from 1968 to 1980, and a Fellow of Wolfson College from 2000, becoming an Honorary Fellow in 2002. In 2007 she was awarded the Lovelace Medal by the British Computer Society, the first woman to receive it. She was also given the Allen Newell Award and the Athena Lectureship by the Association for Computing Machinery. She is not only one of the most significant women in computing, she is simply one of the most important people in computing.

Not all words are equal

Karen’s interest in language may have had a lot to do with her intellectual development, but it was also a matter of luck, as it is for most of us. Karen was born in Huddersfield, Yorkshire, in 1935 and after attending a local grammar school she came up to Girton College, Cambridge in 1953 to read history. After her degree she studied philosophy, or Moral Sciences as it was called at the time, for a year.

After a brief and unsatisfying spell teaching she was invited to join the Cambridge Language Research Unit (CLRU) by its director Margaret Masterman following an introduction from Roger Needham, a friend from undergraduate days who was studying for a PhD in what was then called the Mathematical Laboratory but is now the Computer Laboratory.

CLRU was working on natural language processing, looking at how computers could determine the meaning of sentences. Masterman, reflecting Wittgenstein’s philosophy, believed that meaning, not grammar, was the key to understanding languages and wanted to explore how machines could be programmed to implement this approach. For her part, Karen decided to try to build a thesaurus automatically, which meant transcribing the whole of Roget’s Thesaurus onto punched cards and working closely with Needham on ways to classify information automatically.

She married Needham in 1958, and obtained her doctorate in 1964. Her thesis, published as Synonymy and semantic classification, remains important even today.

In the 1960s she began working in the field of information retrieval, and in 1968 she moved from CLRU to the Computer Laboratory where she stayed.

You could say that Karen’s work made Google and Siri possible, if you were writing headlines for The Daily Mail and wanted to overstate a complicated chain of causality. She was, after all, a pioneer in information retrieval and natural language processing and her contributions helped lay the ground in which the seeds of modern search engines and speech recognition were planted.

For example, she pointed out that not all words were equal when searching a text. In her 1972 paper A statistical interpretation of term specificity and its application in retrieval, she argued that not all hits should be weighted equally: the occurrence of keywords that are broadly distributed throughout the texts being searched matters less than the occurrence of terms that appear in few documents. Her work on information retrieval underpinned the development of search long before the web made it a vital area for research and product development.
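The idea of weighting rare terms more heavily than common ones is easy to sketch in code. The Python below is only an illustration: it uses the common modern formulation of inverse document frequency, log(N / n), which is an assumption here and not necessarily the exact formula in the 1972 paper. The function names and the three sample documents are hypothetical.

```python
import math
from collections import Counter


def inverse_document_frequency(documents):
    """Return a dict mapping each term to log(N / n_t).

    N is the number of documents and n_t is the number of documents
    that contain the term at least once. This is one common variant
    of the weighting, chosen here for illustration.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        # Use a set so a term counts once per document, however often it repeats.
        doc_freq.update(set(doc.lower().split()))
    return {term: math.log(n_docs / count) for term, count in doc_freq.items()}


def score(query, document, idf):
    """Sum the weights of the distinct query terms that appear in the document."""
    doc_terms = set(document.lower().split())
    query_terms = set(query.lower().split())
    return sum(idf.get(term, 0.0) for term in query_terms if term in doc_terms)


# Hypothetical sample collection.
documents = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the thesaurus lists synonyms",
]
idf = inverse_document_frequency(documents)
```

With these sample documents, "the" occurs in every document and so receives a weight of zero, while "thesaurus" occurs in only one and receives the highest weight. A query for "the thesaurus" is therefore ranked entirely on the rarer word.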

While Needham and Spärck Jones both remained at Cambridge University after their marriage, Needham rapidly obtained a tenured position and eventually became head of the Computer Lab, while Spärck Jones had to rely on short-term research grants to fund her work until she was awarded a personal professorship in 1999. This was not a very stable existence. The grants had to have a principal investigator from the department, but Karen was funded as a Senior Research Associate and was not, technically, a member of the faculty. At least things seem a little easier now for distinguished computer scientists like Wendy Hall.

In October last year Stuart Shieber wrote a post about Karen for Ada Lovelace Day, noting that she was “a leader in my own field of computational linguistics, a past president of the Association for Computational Linguistics” and expressing his happiness that “because we shared a research field, I had the honour of knowing Karen and the pleasure of meeting her on many occasions at ACL meetings.”

Connected through computing

I can’t claim such intimacy, and I doubt that Karen ever noticed me in my days as a postgraduate student hanging around in the Titan Room, but I observed her very carefully. In the early 1980s I was very active in the Cambridge University Women’s Action Group and had founded a small society called Men Against Sexism, holding debates and screening films like Rosie the Riveter to groups of similarly minded people. There were only four or five women on the Diploma in Computer Science out of forty or so, and Karen was one of very few female lecturers in the Lab. So I noticed her.

I was also interested in her research. I’d come to computing after having studied first philosophy and experimental psychology, and her work in language processing was especially interesting in contrast to lectures on compiler design and operating system scheduling algorithms — it offered an opportunity for reflection on the way words worked that appealed to me after years studying Wittgenstein. It was only while reading up for her obituary that I was reminded that her first academic job was in the CLRU looking at how computers could determine the meaning of sentences, working for a former student of Wittgenstein, which probably explains why I found her work intriguing.

I worked in the computing industry in Cambridge for several years after I graduated, some of the time at Acorn Computers, and would spend time in the lab and see her around at seminars and lectures, and we would talk about her work. Later, as a freelance writer, I’d do occasional pieces for the Cambridge Alumnus Magazine, and have a reason to visit the library. I remember passing Karen on the stairs of the old Computer Laboratory when it was in the tower on Corn Exchange Street, or exchanging a few words with her over coffee outside the library when I was in there to research a newspaper article, by which time she was Professor of Computers and Information.

In 1999 the Lab celebrated the fiftieth anniversary of the Electronic Delay Storage Automatic Calculator, or EDSAC, one of the first stored-program electronic computers. I was there for the celebratory events, which Karen had organised with her usual efficiency. I had the opportunity to see her in action and also catch up with her and other old friends and teachers as we looked back over the achievements of the computing industry and reflected on how they had built on the work done in the lab since EDSAC ran its first program.

A legacy assured, if unwritten

Since Karen’s death, the British Computer Society has inaugurated an annual lecture in her name that honours women in computing research, alongside its regular Lovelace Colloquium for women undergraduates in computing and related subjects. Karen’s reputation, like that of other women computing pioneers such as Grace Hopper, Anita Borg and Barbara Liskov, seems assured, at least within the profession.

This would almost certainly have pleased her. She once said, “My slogan is: Computing is too important to be left to men”, going on to note, “I think women bring a different perspective to computing; they are more thoughtful and less inclined to go straight for technical fixes.”

There isn’t a biography of Karen Spärck Jones, despite her many achievements in computer science and language processing and her remarkable life, but then again there aren’t very many biographies of the post-war generation of British computer scientists who defined the field and laid the groundwork for today’s information society, male or female.

While all may have merited obituaries in The Times, neither Karen, nor her computer scientist husband Roger Needham, nor the inventor of the subroutine David Wheeler, nor even Maurice Wilkes, builder of EDSAC and director of the computer laboratory where Karen did much of her work, has a biography that I can find, perhaps because they came of age and did their work in a time before we were all so entranced by social network sites and smartphones. We may have to wait a while for these pioneers to find biographers willing to engage with their lives and work, as we did for Babbage and Turing, or perhaps we’ll have transcended the book format and will make do with extended Wikipedia entries for such things.

But I hope that when these biographies come to be written they will encompass the lives and achievements of Karen and the other women who did so much to build the discipline of computer science, kickstart the computing industry and shape the modern world.

Further reading

Abbate, J (2001), “Karen Spärck Jones: An Interview Conducted by Janet Abbate”, IEEE History Center Oral History, http://www.ieeeghn.org/wiki/index.php/Oral-History:Karen_Sp%C3%A4rck_Jones

Shieber, SM (2012), “For Ada Lovelace Day 2012: Karen Spärck Jones”, http://blogs.law.harvard.edu/pamphlet/2012/10/16/for-ada-lovelace-day-2012-karen-sparck-jones/

Spärck Jones, K (2007), “Karen Spärck Jones”, http://www.cl.cam.ac.uk/archive/ksj21/

The Daily Telegraph (2007), “Karen Spärck Jones”, http://www.telegraph.co.uk/news/obituaries/1548315/Karen-Sparck-Jones.html

The Ada Project, “Pioneering Women in Computing Technology”, School Of Computer Science, Carnegie Mellon University, http://www.women.cs.cmu.edu/ada/Resources/Women/

About the author

Bill Thompson has been working in, on and around the internet since 1984 and spends his time considering the ways digital technologies are changing our world. A well-known technology journalist, he is Head of Partnership Development in the BBC’s Archive Development Group, building relationships with partners around ways to make archive material more accessible, and a Visiting Professor at the Royal College of Art.

During the 1990s Bill was Internet Ambassador for PIPEX, the UK’s first commercial ISP. He appears weekly on the BBC World Service’s Click, writes a column for Focus magazine and advises a range of arts and cultural organisations on their digital strategies. He is a member of the board of Writers’ Centre Norwich and of the Collections Trust, and was a Trustee of the Cambridge Film Trust. He also manages the Working for an MP website.

Web: thebillblog.com
Twitter: @billt
