By Tiffany Whitfield

Jian Wu, assistant professor of computer science at Old Dominion University, has been recognized for his work on CiteSeerX, a world-renowned academic search engine.

Wu contributed to the work of C. Lee Giles, the David Reese professor of information sciences and technology at Penn State University and creator of the search engine. Wu, Giles and a team of computer scientists were honored by the Information Retrieval Specialist Group of the British Computer Society (BCS) with the Best Open Source Project award. The BCS recognizes people, projects and organizations that have excelled in the design of search and information retrieval products and services.


Giles developed CiteSeerX, an adaptive, worldwide, large-scale open-source academic search engine that launched as CiteSeer in 1998. Wu joined the team in 2012. The search engine houses more than 10 million full-text English documents along with metadata from 32 million authors and 240 million citation mentions. More than three million users around the world access the site, generating one billion hits and hundreds of millions of downloads every year.

"The team had to overcome both financial and technical challenges to maintain such a production system in an academic setting," Wu said. "The BCS award is a recognition of the persistent work of several generations of team members."

From its inception, CiteSeerX was designed to adapt to users' requirements.

"Automatically, we were able to bring up how many citations a paper had gotten," Giles said. "Indexing based on importance was revolutionary at the time."

To perform this indexing and information extraction at scale, CiteSeerX uses several machine learning methods.

The digital archive search engine was one of the first platforms to implement automated citation indexing, which connects papers and researchers as a network. It actively crawls and harvests academic and scientific documents online and uses autonomous citation indexing, making it possible for users to find related papers using citation graphs. It is often considered a predecessor of academic search tools such as Google Scholar and Microsoft Academic Search.
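For readers curious how a citation graph can surface citation counts and related papers, here is a minimal Python sketch. It is an illustration only and not CiteSeerX's actual code: the paper IDs, the `references` data and all function names are hypothetical, and ranking related papers by shared references (bibliographic coupling) is just one simple technique chosen for the example.

```python
from collections import defaultdict

# Hypothetical toy data: each paper ID maps to the paper IDs it cites.
references = {
    "paper_a": ["paper_c", "paper_d"],
    "paper_b": ["paper_c", "paper_d", "paper_e"],
    "paper_c": ["paper_e"],
    "paper_d": [],
    "paper_e": [],
}


def build_cited_by(references):
    """Invert the reference lists: paper -> set of papers that cite it."""
    cited_by = defaultdict(set)
    for citing, cited_list in references.items():
        for cited in cited_list:
            cited_by[cited].add(citing)
    return cited_by


def citation_counts(references):
    """Count how many papers in the collection cite each paper."""
    cited_by = build_cited_by(references)
    return {paper: len(cited_by.get(paper, ())) for paper in references}


def related_papers(paper, references):
    """Rank other papers by the number of references they share with `paper`."""
    own = set(references.get(paper, []))
    scores = {}
    for other, refs in references.items():
        if other == paper:
            continue
        shared = len(own & set(refs))
        if shared:
            scores[other] = shared
    # Most shared references first; ties broken alphabetically by paper ID.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    print(citation_counts(references))
    print(related_papers("paper_a", references))
```

In this toy collection, `paper_c` is cited by two papers, and `paper_b` is the closest match to `paper_a` because the two share two references. A production system would build the graph from references extracted automatically from crawled documents rather than from a hand-written dictionary.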

"Dr. Wu is a very productive and creative researcher," said Gail Dodge, dean of College of Sciences at 圖朸厙. "We are proud of his contribution to the innovative CiteSeerX project, and congratulations to him and his team on receiving the BCS award."

"I am very glad to hear that Jian received the prestigious BCS award because this indeed is a recognition of his long-term commitment to the CiteSeerX project," said Ravi Mukkamala, professor and chair of the Department of Computer Science. He is very active in research and is a shining star among the new faculty that the department has recruited in recent years."

Wu is working with Penn State researchers on the next-generation CiteSeerX.

"We are refactoring CiteSeerX from Solr Lucene and mySQL to Elasticsearch, all of which is open source," Wu said.

The BCS has more than 60,000 members in 150 countries and is a charity with a royal charter that aims to lead the information technology industry through its ethical challenges, support the people who work in the industry and make IT good for society.
