
Big Scientific Data and Text Analytics Group

Our mission

The mission of the Big Scientific Data and Text Analytics Group (BSDTAg) is to advance the state-of-the-art and develop new technologies powered by AI in the area of the machine processing of scientific information.

We identify with the use of AI for the public good. We carry out research to empower the next generation of researchers to be able to more effectively access, understand, interpret and build on open knowledge and to do so in line with the principles of open science.

More specifically, we:

  • apply AI to improve ways in which research is conducted;
  • develop novel technologies enabling systematic analysis of research data and literature;
  • create services to improve access to scientific information for all;
  • do research on research;
  • support the transition to and raise awareness of the benefits of open research;
  • work with companies to help them derive value from research data and scientific information in areas as diverse as analysing trends and detecting misinformation and plagiarism.

Our research

Our research lies at the intersection of the following areas:

  • Data science, natural language processing, machine learning, data mining, big data
  • Information retrieval, information extraction, recommender systems, semantic web
  • Open science, scientometrics, scholarly communication

Our collaborators:

  • Industry: we advise, provide analytics and deliver new technologies for organisations and innovative industries in areas as diverse as checking and detection of misinformation, analysing research trends, plagiarism detection, research impact evaluation, expert search and recruiting, academic search engines and literature-based discovery.
  • Academic institutions: we deliver innovative tools and support academic institutions with an analysis of their research outputs, open access compliance, trends, comparisons to their rival institutions in the context of research assessment exercises.
  • Funders: we collect data from thousands of institutions and facilitate monitoring of open access compliance and reporting.
  • Partner projects: we derive our reputation from strong collaboration with some of the most prestigious organisations in the area of scholarly communication.

OUR PUBLICATIONS

Our Team

  • Petr Knoth: Founder & Head of CORE
  • Lucas Anastasiou: Senior Developer
  • Valeriy Budko: Full Stack Developer
  • Matteo Cancellieri: Lead Developer (Backend)
  • Jozef Harag: Full Stack Developer
  • Drahomira Herrmannova: Data Analyst & Researcher
  • Alexander Huba: Software Developer
  • Catherine Kuliavets: Personal & Team Administrative Assistant
  • Suchetha Nambanoor-Kunnah: PhD Student
  • Samuel Pearce: Developer and Infrastructure Specialist
  • Nancy Pontika: Open Access Specialist & Communications
  • David Pride: Data scientist
  • Maria Tarasiuk: Software Developer
  • Viktor Yakubiv: Senior Front-End Developer
  • Anna Zelinska: Project manager

LATEST NEWS

New workflow for adding new data...

Catherine Kuliavets

CORE's mission is to increase the discoverability of open access research and promote as widely as possible the content of our data providers, i.e., repositories, journals, and web resources. We currently collaborate with more than 10,000 data providers from around the world and are continuously looking for new ways to increase this number, so that our coverage of the world's open access content is as complete as possible. More information about the new workflow for adding new data providers and gaining access to the CORE Repository Dashboard can be found at Jisc Scholarly Communications. Related Links: Read more.

CORE update for April to June 2020

Catherine Kuliavets

Despite the global situation caused by the pandemic and the ongoing changes, the second quarter of 2020 has seen significant progress in the operation and development of CORE: new products have been released and the team has reached new milestones. Follow the link to read about:

  1. 20 million monthly CORE users and growth of CORE's worldwide rank
  2. CORE Repository Dashboard and Repository Edition releases
  3. CORE helps Lean Library to provide OA research papers
  4. CORE Ambassadors' network and achievements
  5. CORE Discovery and repositories
  6. CORE team research accomplishments
  7. CORE negotiations and partnerships
  8. CORE Statistics

Related Links: Read more.

8th International Workshop on Mining...

Catherine Kuliavets

Due to the global pandemic, this year's 8th International Workshop on Mining Scientific Publications (WOSP 2020) was held entirely online. The workshop ran over a single day in four sessions, featuring keynote talks, presentations of accepted papers and a shared task on citation context classification. More details regarding the programme structure can be found here. The workshop this year was organised by CORE, The Open University, UK, in collaboration with Oak Ridge National Laboratory (ORNL), Tennessee, US. Find out more here. Related Links: Read more.

Tool to Support with REF2021 Open...

Catherine Kuliavets

CORE is happy to announce the release of a new version of the CORE Repository Dashboard. The update will be of particular interest to UK repositories as we are releasing with it a new tool to support REF2021 open access compliance assessment. The tool was developed for repository managers and research administrators to improve the harvesting of their repository outputs and ensure their content is visible to the world. Full details here. Related Links: Read more.

CORE reaches 20 million monthly users

Catherine Kuliavets

Thousands of data providers from almost 150 countries around the world are connected with researchers, students, lifelong learners and the general public via CORE. This past month CORE's monthly users reached 20 million. We are really proud of this and grateful to all our content providers. Read more about this here. Related Links: Read more.

The Joint Conference on Digital...

Catherine Kuliavets

Members of the CORE Team have been working on submissions for the Joint Conference on Digital Libraries (JCDL), and today we are extremely happy to inform our readers that both of our teams have received acceptance notices. Dr. Bikash Gyawali, Dr. Nancy Pontika and Dr. Petr Knoth have been working on "Open Access 2007-2017: Country and University Level Perspective", while David Pride and Dr. Petr Knoth worked on another submission entitled "An Authoritative Approach to Citation Classification". Follow the link to find out more details. Related Links: Read more.

CORE helps Lean Library to provide its...

Catherine Kuliavets

CORE follows its mission and makes open access more visible and reusable by being an enabling infrastructure. This time CORE joins forces with Lean Library, whose aim is to provide seamless access to research materials for users. As a result of this collaboration, the CORE Discovery service will now be indirectly used by library systems integrating Lean Library, thereby reaching more users. More information about this integration can be found here. Related Links: Read more.

3C Shared task: A Kaggle Competition...

Catherine Kuliavets

As part of the International Workshop on Mining Scientific Publications, WOSP 2020 (https://wosp.core.ac.uk/jcdl2020/index.html), researchers at CORE are organizing a new shared task: the '3C' Citation Context Classification Task. The aim of this shared task is to classify the citation context in research publications based on their influence and purpose. The shared task has two subtasks, which will be hosted on Kaggle as separate competitions. Subtask A (https://www.kaggle.com/c/3c-shared-task-purpose) is a multi-class classification task, where the citations are categorized into six classes based on their purpose. Subtask B (https://www.kaggle.com/c/3c-shared-task-influence) is a binary classification task, based on the citation influence. More information can be found in this blogpost. Related Links: Read more.

Track compliance of the REF2021 open...

Catherine Kuliavets

At the end of March, CORE presented a webinar (slides and recording) on how UK HEIs can track compliance with the REF2021 open access policy. The webinar was fully booked and attended by 131 repository managers and research administrators from the UK Council of Research Repositories (UKCoRR) and the Association of Research Managers and Administrators (ARMA) groups. During the webinar the CORE Team presented the CORE Repository Dashboard, a tool specifically designed for research support staff, which contains functionalities that provide useful information about tracking compliance with the REF2021 open access policy. Some of the topics that were discussed during the webinar are:

  • Deposit compliance
  • Deposit time lag
  • Publication dates
  • RIOXX metadata

Read the full blog post to learn more, and access the slides and the webinar recording. Related Links: Read more.

Congratulations to The Core and iTunes...

Kiran Parmar

Every day, millions of people access free OU content. Congratulations to all members, past and present, of our CORE and iTunes U teams for being recognised as two of the five Open Access sources mentioned on the OU's Mission Page. CORE is the world's largest collection of open access research papers delivered in partnership by the Open University and Jisc. CORE hosts over 19 million Open Access full text papers and allows searching and accessing over 170 million research papers. KMi were instrumental in launching the iTunes U podcasts, in partnership with the Open University Learning and Teaching Solutions division. KMi are extremely proud to see these achievements acknowledged; the teams' hard work over many years has paid off, benefiting our students, and exemplifying the OU's core values. Related Links: http://www.open.ac.uk/about/main/strategy-and-policies/mission

CORE update for January to March 2020

Catherine Kuliavets

CORE is happy to keep its readers up to date: here is its quarterly report for the January to March 2020 period. Read the CORE blog post, which covers:

  • CORE is ready to release a premium version of the Repository Dashboard
  • CORE's products are used by Open Access Helper
  • CORE is continuously expanding its ambassadors' network
  • CORE step by step guides
  • CORE as an enabling infrastructure
  • CORE Statistics

Related Links: Read more.

CORE welcomes a leading figure in the...

Catherine Kuliavets

Last Tuesday, March 3, we were privileged at CORE to welcome a leading figure in the quest for Open Access to scientific knowledge. Carl Malamud and Petr Knoth had a very productive discussion about their work and common goals, and shared their experience. Carl Malamud also gave a talk at KMi on text and data mining in scientific journals. For more information about his talk, read here. Related Links: Read more.

CORE raises repository data quality by...

Catherine Kuliavets

Read about our work on going beyond mirroring content from our data providers to improve data quality. In our latest blog post, we present how we link CORE data to complementary scholarly sources and databases including Crossref, MAG, and ORCID. Related Links: Read more.

Are you an iOS user? Access scientific...

Catherine Kuliavets

Claus Wolf, with CORE's support, has developed the OA Helper, a brand new application which enables iOS users to search for scientific articles on their devices without hitting a paywall. Fascinated by Open Access and Open Source, Claus Wolf implemented CORE Discovery and CORE Recommender into this application. Claus Wolf says: "Open Access provides a level playing field on which innovation can be built and also serves as a field for learning. Creating a tool that would support Open Access for macOS & iOS users thus seemed like a worthwhile endeavour." It also turned out to be a great learning opportunity for him. To install this application on your device, just visit the Apple Store site. Related Links: Read more.

CORE welcomes Plan S

Chris Valentine

CORE supports Plan S in granting open access to scholarly outputs (such as publications) to anyone without any barriers and restrictions, including to most forms of use and re-use by humans and machines. Plan S is an initiative supported by the European Commission and various national public funding bodies ("cOAlition S") who, from 2020, will require that all articles by their grantees be published immediately as open access. Plan S is developed by Science Europe, the European association representing the interests of major public research performing and research funding organisations. Read more about this at the CORE Blog. Related Links: Read more.

KMi, the first 25 years

Jane Whild

At the 25th Anniversary KMi Festival we invited staff from across the OU campus to come and find out how our latest knowledge and media technologies are impacting education, science, and cities. The Festival attendees included Lady Kitty Chisholm, one of the three founders of KMi, the STEM Executive Dean, Nick Braithwaite, and the new CFO, Paul Traynor. Visitors had in-depth conversations with the research teams and tried out some hands-on exhibits. Among the fun demonstrations were the Knowledge Makers' Memory Game and the Citizens Science Team's Biodiversity Quiz, which invited visitors to test their powers of observation by comparing photographs of bees and butterflies uploaded by the general public with the species catalogue in order to identify the correct species. There was also a demonstration of the Open Field Lab kit, which is used to broadcast live student fieldcasts through the OU's Stadium Live system. Other stands were showing:

  • Our Social Media Analytics work on online misogyny, which was recently featured in WIRED.
  • Our learning analytics platform (OU Analyse), which has been shortlisted for a Times Higher Education Award. The OU Analyse Team are OU Research Excellence 2019 winners.
  • Our scholarly data platform CORE, which now attracts more web traffic than the OU or FutureLearn websites. The CORE team are also OU Research Excellence Awards 2019 winners.
  • Our collaboration with Springer Nature on scholarly analytics, which has produced a new standard classification scheme for Computer Science as well as a solution for automatic metadata generation that is now in routine use at Springer Nature.
  • Our blockchain-based accreditation work, which underpins the £20M funded Institute of Coding which was highlighted at the OU's TEDx event.
  • Our award-winning MK Data Hub, which was recently applied to power the world's first Smart City Robot competition in Central Milton Keynes.

Back in 1994, the KMi founders had a vision of what the future of knowledge and media would be like, and KMi was created to implement that vision. During these 25 years KMi has established itself as a world-class research centre, and the KMiFest was a great way to come together to celebrate the impact that KMi's intensive research has delivered.

CORE team wins at the Research...

Chris Valentine

CORE was presented with the Outstanding Impact of Research on Society and Prosperity Award at The Open University's Research Excellence Awards Ceremony 2019, which took place at the MK Stadium on October 23rd. Over 150 researchers, academics and support staff attended, with Professor Kevin Hetherington, Pro Vice Chancellor for Research, Enterprise and Scholarship, hosting. The new Vice Chancellor of the Open University, Professor Tim Blackman, also attended and welcomed the guests at the beginning of proceedings. The awards were presented by Professor Monica Grady, and we were extremely lucky to also be joined by Professor Dame Jocelyn Bell Burnell, who presented the Open University's 50th Anniversary Prize for Research. Representing CORE were Dr. Petr Knoth, Matteo Cancellieri, Lucas Anastasiou, Bikash Gyawali, David Pride and Alan Fletcher. Balviar Notay of Jisc, a key partner of The Open University in delivering CORE, also joined the awards ceremony. Dr. Petr Knoth commented: "I am delighted to win this award and would like to thank everyone who contributed to the development of CORE since its start in 2010, including our key partner Jisc as well as other funders, our users and commercial customers and our fantastic and talented current and former staff. It has been a great pleasure for me to be able to lead this fantastic team that is so passionate about the mission of opening up research knowledge to all, including not just researchers, funders and librarians, but also the millions of our users from within the general public who can discover and access the results of research they have contributed to by paying their taxes. It has been a privilege for me to be able to run this project from within the Open University which fully embraces the mission of openness, lifelong education and knowledge sharing.
I hope the work of my team, a service that has become an essential part of the open access infrastructure, will contribute to making the Open University a centre of excellence for Open Research in the future."

CORE makes it to the top 5,000...

Petr Knoth

CORE has just made it to the top 5,000 websites globally according to Alexa Global Rank, which is calculated from a combination of daily visitors and page views on a website over a 3 month period. As of today, CORE ranks at 4,924, climbing 871 places over the last 90 days. The improvement is impressive considering that academic websites typically experience a seasonal slowdown during the summer break. As a result, the rank is likely to improve even further. For comparison, these are the ranks of the Open University (10,605), British Library (17,925), Directory of Open Access Journals (24,255) and OpenAIRE (508,394). These are very strong figures, showing the value for money that CORE delivers to society.

KMi becomes a partner in new EU-funded...

Nancy Pontika

Success in research and innovation should primarily build and depend on clarity of thought, innovation of ideas, and integrity of processes, rather than on external factors like prior reputation or levels of resources. Open Science and Responsible Research and Innovation aim to bring equity and inclusivity to research. Yet could policy interventions in these directions actually worsen existing inequalities? ON-MERRIT studies "Matthew effects" of cumulative advantage in Open Science and Responsible Research and Innovation across research, industry and policy-making, through a mix of sociological, bibliometric and computational approaches. Where we discover such effects at play, we will make policy recommendations to mitigate or negate these effects. Dr. Petr Knoth explains: "ON-MERRIT will help us to better understand the flaws of the axiomatically established indicator-based incentives system that is currently deeply driving academic practice. This understanding will enable us to apply data-driven approaches to seek new counter-measures that reward based on merit and incentivise good research practices, such as reproducibility, transparent research workflows, and open research data and software sharing. The project will build on the expertise acquired by the Big Scientific Data and Text Analytics (BSDTAG) group in the area of open science and big data analytics which has been developed through a series of KMi projects supporting the CORE (core.ac.uk) service over the last 8 years." ON-MERRIT will be launched in October 2019 and will run until March 2022, with total funding of 1 million Euros from the EC's Horizon 2020 programme.
Partner Consortium:

  • KNOW-CENTER GMBH - Research Center For Data-Driven Business & Big Data Analytics, Graz (AT)
  • TU Graz - Institute of Interactive Systems and Data Science, Graz (AT)
  • THE OPEN UNIVERSITY - Knowledge Media Institute (UK)
  • UNIVERSIDADE DO MINHO - Minho (PT)
  • GEORG-AUGUST-UNIVERSITÄT GÖTTINGEN - Göttingen State and University Library (DE)

Related Links: Original Press Release

KMI's CORE highly visible at Open...

Nancy Pontika

CORE participated in the Open Repositories conference (10 - 13 June 2019), which took place in Hamburg, Germany. This year's conference theme was "All the user needs". CORE received much attention and participated actively with 5 presentations:

  • Assessing compliance with the UK REF2021 Open Access Policy
  • Comparing the performance of OAI-PMH with ResourceSync
  • CORE Analytics Dashboard
  • Analysing the performance of open access papers discovery tools
  • The future of scholarly communications professionals

Read more… Related Links: CORE's conference presentations

KMi's CORE to be used in the Research...

Nancy Pontika

CORE, a global aggregator of open access content and the UK's national aggregator, will be assisting UK Research and Innovation's audit for REF2021 by supplying deposit date information for all UK REF outputs. Compliance with the REF2021 Open Access Policy is established when an author deposits the post-print or Author's Accepted Manuscript in a repository, institutional or subject, within 90 days from its acceptance. The REF audit committee will consult CORE to discover the deposit date and decide whether an output is compliant with the policy. Research England's REF Audit Guidance specifically states: "40. We will undertake verification of the dates that outputs became publicly available, particularly where they were published early in the REF period or are marked as 'pending' publication (for example, by obtaining a letter from the publisher). This will include checking the publication year against the Crossref database and against Jisc CORE" ... "46. We will assess each HEIs' overall compliance with the REF 2021 open access policy by: … iv. Using Jisc CORE, comparing the datePublished and depositedDate and identifying where the number of days between the two dates is greater than 92." ... "49. Where there is insufficient evidence to demonstrate a robust and well-managed process for open access, we will identify a set of outputs from each submission made by the HEI, and request further information to verify whether they are compliant with the policy, or whether an exception applies. Outputs may be selected randomly, or based on information in unpaywall.org or Jisc CORE, or a combination of the two. We will select outputs that have been returned as compliant with the policy, and/or outputs that have been returned with exceptions." For more information visit the "Audit Guidance" here. Related Links: Audit Guidance
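
The date comparison described in paragraph 46 of the guidance quoted above can be illustrated with a minimal sketch. The field names `datePublished` and `depositedDate` and the 92-day threshold come from the quoted guidance; the record layout, the function names and the sample dates are hypothetical and made up for illustration, and the sketch does not call the CORE API or reproduce the auditors' actual procedure.

```python
from datetime import date

# Threshold taken from the quoted guidance: flag gaps greater than 92 days.
MAX_GAP_DAYS = 92


def deposit_gap_days(date_published: str, deposited_date: str) -> int:
    """Days from publication to deposit (negative if deposited before publication)."""
    published = date.fromisoformat(date_published)
    deposited = date.fromisoformat(deposited_date)
    return (deposited - published).days


def exceeds_threshold(date_published: str, deposited_date: str,
                      max_gap_days: int = MAX_GAP_DAYS) -> bool:
    """True when the gap between the two dates is greater than the threshold."""
    return deposit_gap_days(date_published, deposited_date) > max_gap_days


# Hypothetical records; identifiers and dates are illustrative only.
records = [
    {"id": "output-1", "datePublished": "2019-01-10", "depositedDate": "2019-02-01"},
    {"id": "output-2", "datePublished": "2019-01-10", "depositedDate": "2019-06-15"},
    {"id": "output-3", "datePublished": "2019-03-01", "depositedDate": "2019-06-01"},
]

flagged = [
    r["id"]
    for r in records
    if exceeds_threshold(r["datePublished"], r["depositedDate"])
]
print(flagged)  # ['output-2']
```

In this sketch a gap of exactly 92 days (output-3) is not flagged, because the guidance speaks of gaps "greater than 92".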

KMi Researchers Win Vannevar Bush Best...

Petr Knoth

KMi researchers Dasha Herrmannova, Nancy Pontika and Petr Knoth have won the Vannevar Bush Best Paper Award at the Joint Conference on Digital Libraries (JCDL 2019) for their paper titled: "Do Authors Deposit on Time? Tracking Open Access Policy Compliance". JCDL 2019 is an A* conference (highest rank) and the world's highest ranking venue for digital libraries research, within the top 4% of all computer science conferences according to the Computing Research & Education Conference Portal. JCDL took place this year at the University of Illinois at Urbana-Champaign, United States. The paper, which uses data from CORE (core.ac.uk) to quantify the growing traction of open access, received media attention even prior to the presentation at JCDL 2019 and was featured in an article in Physics Today. Dasha Herrmannova says: "I am utterly astonished and still can't get over the fact that we won the best paper award last night at this amazing conference." Petr Knoth says: "It has been a pleasure to work with this amazing team. We went through many revisions on this paper as the work turned out to be more complicated than we originally anticipated, but it paid off." What is even more impressive is that this is the second Best Paper Award in a 12-month period at key digital libraries conferences for the Big Scientific Data and Text Analytics Group (BSDTAG). David Pride and Petr Knoth won the Best Paper Award at the best European DL conference, TPDL, in Porto, Portugal in September 2018 for their paper: "Peer review and citation data in predicting university rankings, a large-scale analysis". Related Links: Study quantifies the growing traction of open access; Do Authors Deposit on Time? Tracking Open Access Policy Compliance

KMi's CORE partners with Turnitin, a...

Nancy Pontika

CORE, the world's largest aggregator of open access scientific content and Turnitin, a global leader in plagiarism detection software, have entered into a collaboration. Using CORE's FastSync service, Turnitin's proprietary web crawler will search through CORE's vast global database of open access content and metadata—135 million metadata records from over 3,700 data providers and counting—to check for text similarity. "As the scholarly publishing industry evolves, Turnitin's services must similarly adapt," said Valerie Schreiner, Turnitin SVP Business and Corporate Development. "This partnership with CORE ensures that our database remains at the forefront of publishing trends and can continue to best serve the needs of our customers and partners." Access the Press Release here. Related Links: Press Release

KMi's FOSTER eLearning platform...

Nancy Pontika

A Nature article titled "Data sharing and how it can benefit your scientific career", which discusses the importance of opening up research data and how data sharing could benefit researchers, mentions the FOSTER Open Science eLearning Portal, developed at KMi. The EU-funded "Facilitate Open Science Training for European Research" (FOSTER) and its continuation project "Fostering the practical implementation of Open Science in Horizon 2020 and beyond" (FOSTER Plus) have established the FOSTER eLearning portal, which has been implemented at KMi. FOSTER offers more than 1,300 training resources, 45 courses (offered either self-paced or moderated) and five learning paths leading to specialisations in Open Science. KMi's Big Scientific Data and Text Analytics Group (BSDTAG) has participated in the two FOSTER projects. KMi fully developed and hosts the training technology and has also contributed to the creation of the training content and courses. Related Links: FOSTER portal

KMi researchers' study quantifying the...

Drahomira Herrmannova

A study by Drahomira Herrmannova, Nancy Pontika, and Petr Knoth of KMi has been featured in Physics Today, the flagship publication of the American Institute of Physics (AIP). The study evaluated the time it took for academics to deposit some 800,000 papers in repositories in relation to when these papers were published. The bibliometric data for the study came from KMi's CORE. As the Physics Today article noted, the study found that while the time to deposit has been decreasing globally, the change has been particularly pronounced in the UK. In fact, since 2016, UK-based scientists have been posting their papers online more quickly than those in the other four nations with the highest number of papers in the dataset: the US, the Netherlands, Italy, and Switzerland. The REF 2021 Open Access Policy, which requires depositing papers within three months of their acceptance date, may have accelerated this trend in the UK. According to the authors, the key message of the paper is that this observation supports the argument for the inclusion of a strictly time-limited deposit requirement in OA policies. The study has also found significant differences between deposit practices at different universities, suggesting that institutions play an important role in supporting Open Access. The study will be presented at the ACM/IEEE Joint Conference on Digital Libraries in Urbana-Champaign, IL, in June. The code and the dataset used in the study are available online. Related Links: Physics Today article; The study

CORE releases a new front-end

Petr Knoth

We are very excited to announce that CORE has released a new front-end, marking the end of Phase 1 of front-end improvements, which will continue with 2 more phases. The key highlights of the new UI are:

  • A more modern yet functional look and feel.
  • Support for mobile devices.
  • A new and better presentation of CORE's mission and services.
  • Cross-browser support covering over 95% of CORE's users.
  • Accessibility improvements.
  • Removal of single point of failure dependencies, taking full advantage of CORE's high availability infrastructure.

But what is an end to one thing is a start to another. The objectives of Phase 2 are now:

  • Taking CORE's search experience to a new level.
  • New functionalities for the online CORE Reader.
  • Improvement to some existing static pages.

Special thanks to everyone involved in this release: Viktor Yakubiv, Tom Davey, Matteo Cancellieri, Balviar Notay, Samuel Pearce, Sergei Misak, Svetlana Rumyanceva, Nancy Pontika and Petr Knoth. Related Links: CORE - The world's largest collection of open access research papers

CORE hits 10 million monthly active...

Petr Knoth

CORE usage increased dramatically in 2018 and hit the 10 million monthly active users mark in January 2019 (10.41 million users). This is a 571% increase in users compared to January 2018. As of January 2019, CORE was the 5,448th most used website globally according to the independent Alexa Rank. This rank is calculated from a combination of daily visitors and page views on a website over a 3 month period. To put this into perspective, at the time of writing, the rank indicates that CORE has significantly more users than FutureLearn (rank 6,083), The Open University (rank 8,849), British Library (rank 12,702), Jisc (rank 75,663) and many other significant organisations. Related Links: Alexa Global Rank for CORE

CORE partners with Naver, South...

Petr Knoth

CORE, the world's largest aggregator of open access scientific content, and Naver, South Korea's number one search solution, have entered into a collaboration that will see CORE's content being made available to 42 million Naver users. As part of the collaboration, Naver ingests data collected by CORE to enrich its Naver Academic search system with millions of open access papers. The aim of both services is to provide free access to scientific publications and make the experience seamless. Read more. Related Links: Read more on the Jisc website

Knowledge Makers v5.0 Raspberry...

Danita Davidson

The Knowledge Makers recently took over the Berrill Theatre and Mezzanine for their fifth, and rather special, event. Over 100 attendees from across all faculties took part on the day, and joining them were two guests from The Raspberry Pi Foundation: Philip Colligan, CEO, and Dr. Sue Sentance, Chief Learning Officer. The event kicked off with a 'Raspberry Research' showcase where OU researchers displayed their current research or teaching projects that are using Raspberry Pi. Eleven teams showed off an incredible variety of use-cases for the single board computer. KMi demonstrated a strong presence with their OpenBlockchain and GreenData project teams, ably represented by Michelle Bachler and Chris Valentine, attracting a good deal of attention from the attendees. The KMi SciRoc team brought along some of their recent developments in working towards bringing robots to smart cities. Other researchers from across the OU were also in attendance, notably from the OpenSTEM labs, who showed off their incredible Mars Rover, and the MAZIZONE team, who brought a range of interactive and engaging displays. Teams from STEM also took part with projects from healthcare (STRETCH) to networking, with the 'Network in a Box' being used to teach networking concepts via the OU Cisco Academy. Following the showcase, Dr. Petr Knoth opened the keynote session with the results of a new investigation showing how Raspberry Pi is being used in research globally. The data that informed this research was drawn from the full-text articles held in the CORE dataset. Excitingly, CORE recently became the world's largest legal repository of full-text scientific articles. An engaging keynote by Philip Colligan about the Raspberry Pi and the foundation rounded the day off, after which he was presented with a framed 'Raspberry Research word cloud' built using data from the CORE research project. Overall, the event was a huge success. New partnerships and friendships were formed and a great time was had by all.
The Knowledge Makers will be back in December for 'A Very Maker Christmas', which will be taking place in the Library at Walton Hall (date tbc). Visit http://knowledgemakers.kmi.open.ac.uk to see details of this and all the other Knowledge Makers events and workshops.

CORE mentioned in a Nature article on...

Nancy Pontika

CORE has received a mention in a Nature article titled: "How AI technology can tame the scientific literature." The article discusses how Artificial Intelligence (AI) helps researchers, and more generally anyone in need of scientific information, to discover new knowledge in the vast amounts of available scientific literature. It is estimated that up to two research papers are published every minute, making it difficult for anyone to retrieve, read and digest all this content. As a result, new services that use machine learning, natural language processing, and algorithms are emerging. CORE has been mentioned in this context due to its collaboration with Iris.ai, a literature-exploration tool powered by artificial intelligence that is fully reliant on data supplied by CORE through its API. CORE provides a number of data services and is capable of offering enterprise machine access to a large corpus of research papers using a newly developed service called CORE FastSync. Related Links: How AI technology can tame the scientific literature

KMi researchers, David Pride and Petr...

Petr Knoth

The best paper award at the 22nd International Conference on Theory and Practice of Digital Libraries (TPDL 2018) went to the paper authored by David Pride and Petr Knoth titled "Peer review and citation data in predicting university rankings, a large-scale analysis." The paper conducted the largest analysis of REF2014 data so far (data on 145 thousand submitted papers and 7 million citations across all 36 REF Units of Assessment/disciplines), looking at the link between peer review, conducted by REF-nominated panels, and bibliometric indicators. The study found surprisingly high correlations between the REF results at an institutional level (Grade Point Average, GPA) and simple bibliometric indicators. This indicates that the 2014 REF results could have been predicted to a high degree of accuracy using automated techniques for about a third of the disciplines, namely those with high average citations per paper. If such an approach were adopted for just those disciplines, this could result in savings to UK universities and Research England of about £50 million every time a national exercise is run, and even more if more disciplines adopted a similar approach. Since the preprint of this study was made available, a number of researchers have made contact with us and confirmed that they have since obtained similar results. This information is now being discussed with Jisc, who finance the project, to advise Research England on the next steps. TPDL 2018 is the most highly regarded conference in the area of digital libraries in Europe and the 2nd worldwide. TPDL 2018 took place in Porto, Portugal. Pride, D. and Knoth, P. (2018) Peer review and citation data in predicting university rankings, a large-scale analysis, Theory and Practice of Digital Libraries (TPDL) 2018, Porto, Portugal. Lecture Notes in Computer Science, Springer. https://arxiv.org/abs/1805.08529
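To make the idea of an institution-level correlation concrete, here is a minimal sketch of how a peer-review score could be compared with a simple bibliometric indicator for one discipline. It is an illustration only and not the method used in the paper: the choice of Pearson correlation, the function name `pearson`, and all the numbers are assumptions made for this example (the values are invented placeholders, not REF2014 data).

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for five institutions in one discipline.
# These are invented for illustration and are NOT REF2014 results.
gpa = [3.2, 2.9, 3.5, 2.6, 3.0]                  # peer-review Grade Point Average
mean_citations = [14.1, 10.3, 18.7, 8.2, 12.5]   # mean citations per submitted paper

r = pearson(gpa, mean_citations)
print(f"Pearson r = {r:.2f}")
```

A value of r close to 1 would mean that institutions ranked highly by peer review also tend to have highly cited papers, which is the kind of relationship the study reports for high-citation disciplines.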

Knowledge Makers' 3D Printing Workshop...

David Pride

On the 30th May, the Knowledge Makers organised the first 3D printing workshop to take place at KMi. 23 people from all OU faculties attended and got hands-on with an introduction to the software tools used to design 3D objects, and also got to see some 'live' printing and a range of finished examples. The attendees were given a brief overview of OpenSCAD, an open-source 3D design tool, and were then tasked with designing an object of their choice in just 90 minutes. The results were truly amazing, clearly demonstrating what happens when you combine engaged and enthusiastic participants with powerful tools. Some wonderful designs were realised, including chairs, dice, wheels, ladders (albeit very small ones) - and even Tower Bridge! Outstanding design of the day, however, went to team 'Piggy Bank', who worked flawlessly together, each member producing one section of the final model. See the photos below for the impressive end results. The session gave the opportunity to bring people together from across the OU and introduce them to a new skillset. Additionally, attendees were introduced to some of the amazing facilities available on campus, including the FabLab and Rapid Prototyping Lab. There is a wealth of talented people and fantastic resources here at the OU, and we firmly believe events like this one can help to bring the two together. About the Knowledge Makers: We are a growing group of enthusiastic makers, hackers and tinkerers who hold bi-monthly meetups at the Walton Hall Campus. We actively encourage makers and crafters of ALL varieties to get involved. It does not matter what your making passion happens to be; we believe sharing your passion is what makes a difference. Related Links: Knowledge Makers on the web Knowledge Makers on GitHub Twitter

Drahomira Herrmannova successfully...

Petr Knoth

KMi research student Drahomira Herrmannova (Dasha) has successfully defended her PhD thesis titled: "Mining Scholarly Publications for Research Evaluation." While current research metrics evaluate the excellence of a publication based on the number of interactions in the scholarly network, such as the number of times it has been cited (Bibliometrics) or downloaded (Altmetrics), this thesis explores the use of publications' full texts in research evaluation. The thesis first investigates what research quality is and then defines a new class of research evaluation metrics called Semantometrics and its first metric, called contribution. It then demonstrates, on a newly created True Impact Dataset, that the contribution metric can be more effective in identifying key research than existing research evaluation metrics. Dasha's examiners were Prof Enrico Motta and Dr Iana Atanassova of the University of Franche-Comté. Dasha's supervisors were Dr Petr Knoth and Prof Zdenek Zdrahal. Dasha will continue her research in this area at the Oak Ridge National Laboratory in the United States. All the best, Dasha!

CORE, a KMi service, partners with...

Nancy Pontika

The CORE service is working in partnership with ProQuest to deliver more content within their library discovery services (Ex Libris Primo and Ex Libris Summon). What does this mean for the end user? This means that search results will bring back more relevant content from OA repositories worldwide in addition to the existing library collection records. The user will not have to go to a separate search interface to run the same search query. Read more...

KMi PhD student kicks off new Jisc...

David Pride

A new blog launched yesterday by Jisc focuses on their Open Metrics project, which aims to support the development of new research metrics. Following the publication of The Metric Tide report in 2015, there is increasing awareness within the sector of a need for new research evaluation metrics that move beyond the limitations of traditional citation-based metrics. The launch included a piece that introduces David Pride, a PhD research student at KMi. David's current research looks at the large-scale evaluation of research articles using the publications' full text. You can read the post here: https://openmetrics.jiscinvolve.org/wp/2017/11/citations-created-equal/ And the Open Metrics blog can be read here: https://openmetrics.jiscinvolve.org/wp/about/

KMi researcher visits Ethiopia and...

Nancy Pontika

EIFL's invitation to KMi's CORE project to take part in a workshop for researchers from developing countries pays dividends for participants and for CORE. In June 2017, EIFL invited the global open access full text aggregator CORE to take part in an Open Science train-the-trainer course for universities and research institutions in EIFL partner countries. Read more on EIFL's post and check CORE's blog to watch the videos of the workshop participants talking about CORE. Related Links: EIFL blog post CORE blog post

CORE listed as a top tool and resource...

Nancy Pontika

Laworm, an aggregator of scientific online tools addressed mainly to scientists, has listed CORE as a top tool and resource, which helps science to become open and collaborative. Related Links: Tools and Resources to make Science Open and Collaborative

CORE organised and presented two...

Nancy Pontika

From 25 – 27 October OpenMinTeD participated in the FORCE2017 Research Communication and e-Scholarship conference, which brings together a diverse group of people interested in changing the way in which scholarly and scientific information is communicated and shared. On Friday October 27th the OpenMinTeD partners held two workshops, one on "How to improve interoperability across publisher platforms to support text and data mining" and another on "Enhancing the real impact of scholarly publications through text and data mining". At the first workshop the Open University partners from the CORE project presented the work they have done on the Publisher Connector. This involved surveying publishers on their machine interfaces for accessing Open Access content; creating the Publisher Connector, a tool that harvests Open Access content from publisher systems and exposes it via the ResourceSync protocol; and building the technical expertise directory, which documents how harvesting from publisher platforms can be achieved. Read more... Related Links: Original blog post

CORE's Open Access Week 2017

Nancy Pontika

1. DOES YOUR ORGANIZATION HAVE AN OPEN ACCESS STRATEGY? AND HOW ARE YOU IMPLEMENTING IT? CORE is an Open University (OU) project and is jointly funded by the OU and Jisc. CORE is a global full-text aggregator of Open Access content that harvests repositories, both institutional and disciplinary, as well as Open Access and Hybrid Journals. Today, the CORE team at the OU runs the CORE service, which is the world's largest aggregator of open access research publications from repositories and journal systems at a full-text level. Currently CORE harvests more than 3700 repositories and 6000 journals, and has 80 million metadata records and almost 8.5 million full texts. Our mission is to aggregate all Open Access research outputs and make them available to the public. We support the citizens' right to have access to information and we have established a wide set of services for that purpose. All our services are free of cost to end users and enable them to gain access to Open Access content in both human and machine readable form and to develop their own applications using our content. Read more Related Links: Original blog post

WOSP2017 - Touchdown Toronto

David Pride

Since 2012, members of KMi's CORE team, headed by Petr Knoth, have orchestrated the WOSP (Workshop On mining Scientific Publications), held each year as a part of JCDL (Joint Conference on Digital Libraries). Previously held in locations as diverse as London and Indianapolis, this year the 6th annual international WOSP workshop took place at the University of Toronto. Over 100 academics joined us to hear presentations from 14 authors on a wide range of topics, from using machine learning to detect academic plagiarism, to using text and data-mining to interrogate a bilingual scientific repository. Our very own Petr Knoth presented new research on Recommender Systems; you can see the slides from this presentation HERE. We also had several demonstrations: Victor Botev gave us a really nice overview of their 'Iris.ai – the Science Assistant' project, whilst Ron Daniel from Elsevier presented the 'Content Analytics Toolbench (CAT)'. We had fantastic keynote speeches. The first was from Jevin West (Assistant Professor at the Information School at the University of Washington and co-director of the DataLab), who introduced us to VizioMetrix, a platform that extracts visual information from the scientific literature. Our second keynote was Waleed Ammar, research team lead at Semantic Scholar, who spoke about their latest work around citation extraction and recommender systems. Researchers can check out their Cite-o-Matic recommender HERE. Many thanks to all our speakers, authors and attendees; hopefully we'll see many of you next year for WOSP 2018! Related Links: Full workshop details here

CORE participates in new EU funded...

Nancy Pontika

FIT4RRI is precisely intended to contribute to bridging the gap between RRI (Responsible Research and Innovation) and Open Science and to promoting viable strategies to bring about institutional changes in RFPOs (Research Funding and Performing Organizations). FIT4RRI moves from the assumption that there is a serious gap between the potential role RRI and OS (Open Science) could play in helping RFPOs to manage the rapid transformation processes affecting science (especially the science-in-society aspects) and the actual impact RRI and OS are having on RFPOs, research sectors and national research systems. The project will act on two key factors: (1) enhancing competencies and skills related to RRI and OS through an improvement of the RRI and OS training offer; and (2) institutionally embedding RRI/OS practices and approaches by promoting the diffusion of more advanced governance settings. 'Through FIT4RRI we want to engage hard scientists into responsibility matters and promote RRI and OS as drivers for institutional change in research funding and performing organizations. We look at science as a tool to create bridges towards society', says Andrea Riccio, Project Coordinator, UNIROMA1. CORE makes two contributions to this project: it will create the platform to host the RRI resources, training tools and events, and it will also run an RRI experiment with a focus on text and data mining. The FIT4RRI project was granted within the Horizon 2020 Programme of the European Union after a competitive one-stage selection process. The project started with a kick-off meeting in Rome at the Sapienza University on the 12th & 13th of June 2017 and will be funded for three years. Related Links: http://fit4rri.eu/

CORE listed Number 1 in the list of...

Nancy Pontika

An online editing and proofreading company, Scribendi, has recently put together a list of the top 21 freely available online databases. It is a pleasure to see CORE listed as the Number 1 resource in this list. CORE has been included thanks to its large volume of open access and free of cost content, offering 66 million bibliographic metadata records and 5 million full-text research outputs. Our content originates from open access journals and repositories, both institutional and disciplinary, and can be accessed via our search engine. In addition, we also offer an API and Datasets for programmable access to this content, enabling the development of new artificial intelligence-based applications for scientists and the text and data mining of scientific literature. Related Links: The Top 21 Free Online Journal and Research Databases

CORE now offers 5 million open...

Nancy Pontika

CORE, a harvesting service that aggregates open access content from open access journals and repositories from all over the world, currently provides 5 million open access full-text papers. "In the last year, we have managed to scale up our harvesting process. This enabled us to significantly increase the amount of open access content we can offer to our users. With more and more open access content being made available by data providers, thanks to recent open access policies, CORE now also captures and provides access to a higher percentage of global research literature", says CORE's founder, Dr Petr Knoth. With 66 million metadata records and 5 million full texts, from 102 countries, in 52 different languages, CORE is now the world's largest full-text open access aggregator. CORE embraces the vibrant collections of both institutional and disciplinary repositories, while its large volume of scholarly outputs ranges from scientific research papers to grey literature and from Master's to Doctoral theses. In addition, it is a metasearch for all the open access peer-reviewed scientific journal articles published in open access journals. CORE's open access collection can be accessed from our search engine (https://core.ac.uk). For those interested in using our data for other purposes, such as building services or applying text and data mining practices, we offer all the data for free via an API and a Dataset. Related Links: CORE

KMi Researchers participated in a case...

Nancy Pontika

Last month the French Association of Directors and Officers of University Libraries and Documentation (ADBU) released a report entitled "Text and Data Mining in Higher Education and Public Research", which mainly explores the UK and French copyright exceptions for text and data mining (TDM). In more detail, the report lists the benefits of text and data mining in scientific research, defines the primary threats to the adoption and practice of TDM, i.e. legal and technical ones, presents the need for the development of a technical infrastructure, and describes the motivation barriers and the necessary developments in the field. In an effort to understand the level of TDM adoption and the lack thereof, the report presents various case studies, one of which is the CORE project. CORE, an aggregation service currently holding around 4.5 million full texts and 66 million metadata records, has been providing infrastructure for TDM via its main services, namely the CORE API and the CORE Datasets. As the report puts it: "Text-mining at scale cannot take place without infrastructure. Investment is needed in the technologies used to aggregate, normalise, interrogate and preserve TDM materials". CORE's services offer open access content and are provided to everyone free of cost. In addition, CORE is participating in the EU-funded project OpenMinTeD, which aims to create a TDM infrastructure, focusing on legal, technical, policy and interoperability issues; CORE's role in it is to act as an open access scientific content provider. In addition to the technical challenges, there are also legal requirements that create obstacles and limit the incentives for TDM. Even though there have been amendments to both UK and French copyright law, there are still grey areas that inhibit the application of TDM practices among researchers. Furthermore, the legal framework is not harmonised across countries, and in some of them it does not exist at all.
The report states that "changes to copyright law must be accompanied by improvements in access, infrastructure, skills and incentives for TDM". In that context, and while CORE is already contributing technically to the promotion of TDM, it welcomes all efforts for the advancement of TDM and is open to providing assistance with the development of new policies and the improvement of existing ones, based on its own TDM experience. Related Links: Access the full report

Zdenek's 25 years in KMi!

Chloe Bays

On Thursday November 17th at the Town Meeting, the whole of KMi celebrated Zdenek's 25 years of working at the Open University, within KMi. Petr Knoth led a tribute from Zdenek's team, after which we commemorated the occasion in epicurean style, including a personalised giant cookie and an edible pie-chart made by the Analyse team! We reflected that Zdenek has done some great work at the Open University and that 25 years marks an incredible achievement. Throughout his time here, Zdenek has contributed to the fields of Artificial Intelligence, Case Based Reasoning, Design, Knowledge Sharing, Machine Learning and Predictive Modelling. KMi is sure that Zdenek will continue doing great work (hopefully for another 25 years) at the Open University!

KMi Researchers Win the Best Poster...

Petr Knoth

Drahomira Herrmannova and Petr Knoth have won the Best Poster Award at JCDL 2016 in Newark, USA with their contribution "Semantometrics: Towards fulltext-based research evaluation." This was very good timing, as the full experimental report on semantometrics commissioned by Jisc was published in this announcement a week prior to the conference. The CORE team at KMi have also organised a successful 5th International Workshop on Mining Scientific Publications (WOSP 2016). The workshop was attended by key people in the area of text and data mining of research papers from both Europe and the USA. This year the workshop featured two keynotes. Yuxiao Dong of Notre Dame University gave a talk titled "AMiner: Towards Understanding Big Scholarly Data" and Michael J. Kurtz of Harvard-Smithsonian Centre for Astrophysics presented the "Astrophysics Data System: The Joy of Text". At the workshop, Drahomira Herrmannova also presented a joint long paper with Petr Knoth titled: "An Analysis of the Microsoft Academic Graph." This year the WOSP workshop was sponsored by the OpenMinTeD project, in which KMi participates, and we invited two speakers from the project. Stelios Piperidis of Athena Research Centre gave a talk on "Making sense of scientific textual content" and Peter Mutschke of GESIS discussed in his talk the "Challenges and potential of text mining in scholarly information retrieval."

CORE wins Best Poster Award at the...

Nancy Pontika

Last week, the CORE team attended the 11th Annual Conference on Open Repositories, an international conference addressed mainly to subject and institutional repository managers, focusing on open access, open data and open science tools, projects and services. At the conference the team had six submissions: 1. A workshop presentation on "How can repositories support the text-mining of their content and why?" where Nancy Pontika explained how repository managers should be supportive of text-mining practices and Petr Knoth described the technical requirements that can enable the text mining of repositories. In addition to that, the CORE team was the workshop organiser, as part of its involvement with the OpenMinTeD project, an EU-funded project on text and data mining. The workshop has been described in two blog posts, one hosted at the OpenMinTeD blog (which includes all workshop presentations), and another post composed by Rebecca Sutton Koeser, a workshop participant. 2. A full presentation on "Exploring Semantometrics: full text-based research evaluation for open repositories" by Petr Knoth. The presentation explored semantometrics, a new class of research evaluation metrics, which builds on the premise that full text is needed to assess the value of a publication. (Presentation available here.) 3. A 24x7 presentation on the "Implementation of the RIOXX metadata guidelines in the UK's repositories through a harvesting service", where Matteo Cancellieri and Nancy Pontika described how the RIOXX metadata guidelines are now a new embedded feature in the CORE Repositories Dashboard. (Presentation slides here.) 4. & 5. Two demo presentations during the Developer Track sessions.
The first one was on "Mining Open Access Publications in CORE", where Matteo Cancellieri demonstrated the new CORE API, and the second was entitled "Oxford vs Cambridge Contest: Collecting Open Research Evaluation Metrics for University Ranking", where Petr Knoth used the traditional Oxford University vs Cambridge University contest to show how to freely gather and compare the research performance of universities. (The code for both demo presentations is on Github.) 6. A poster on the "Integration of the IRUS-UK Statistics in the CORE Repositories Dashboard", by Samuel Pearce and Nancy Pontika, which showed the process of embedding the existing IRUS-UK statistics service into the CORE Repositories Dashboard. We were also delighted that our poster won the best poster award (yay!). We would like to thank all the conference participants who stopped by our poster, got the CORE freebies and voted for us! (You can access the poster here.) Because this conference has a clear focus on repository services, and the CORE service uses or is used by these services, we were also extensively mentioned in other presentations. For example: Richard Jones in his presentation on Lantern mentioned that the project is using the CORE API; Paul Walk described how CORE is using the RIOXX metadata application profile; the Repositories of the Future panel, organised by COAR, stressed the importance of the role of aggregators in the repository environment, specifically naming CORE; and the "Ideas Challenge", a thought-provoking brainstorming group exercise for programmers and repository managers that focused on how to make the lives of academics easier, proposed CORE as a runner up for the development of a cross-repository journal and topic browse interface. Finally, CORE was also presented in the Jisc poster on "Jisc's Open Access Services".
As a genuine open access, open data and open science supporter, CORE is also participating in the EU-funded Facilitate Open Science Training for European Research (FOSTER) project, which also presented a poster on the project's main activities, objectives and e-learning platform. CORE has built the portal and the e-learning platform. The past week has been a good week for the CORE team. We met our old friends, made new ones and received precious feedback from the community on our services, but more importantly we realised that the CORE service is integral to the repositories community. So, stay tuned! Related Links: CORE blog post link

CORE had 6 proposals accepted at the...

Nancy Pontika

At this year's Open Repositories 2016 Conference, an international conference addressed to the scholarly communications community with a focus on repositories, open access, open data and open science, CORE had 6 items accepted: 1 Paper, 1 Repository Rave presentation, 1 Workshop, 1 Poster and 2 showcases in the Developer Track and Ideas Challenge. In our presentations we will explore topics on semantometrics, text and data mining and the integration of the RIOXX metadata and the IRUS-UK statistics in the CORE Dashboard. In the two developer track sessions we will demonstrate, respectively, how to freely gather and compare the research performance of universities and how open access publications can be mined from the CORE API. Related Links: Here you can find the summaries of our proposals.

Responsible Research Metrics

Nancy Pontika

At this year's Jisc DigiFest Dr. Petr Knoth was invited to sit on a panel discussing Responsible Research Metrics. This panel was organised in the context of the recently published The Metric Tide report commissioned by HEFCE, which looked into issues surrounding the use of quantitative research metrics in REF. The other two panelists were Prof. Stephen Curry of Imperial College and Prof. Cameron Neylon of Curtin University. In his talk, Petr argued for the need to develop a range of new research metrics that make use of article full texts. We call these semantometrics. Petr also stressed that we need to move away from performance measures established axiomatically or ad hoc, without demonstrating on data their ability to capture aspects of research performance. These measures include, in particular, widely used higher-level metrics such as the h-index. "We need to move towards data driven approaches to the development of research evaluation metrics," he reiterated. Related Links: Presentation link

Petr Knoth is featured in a free ebook...

Nancy Pontika

The ebook "Text Analytics: 28 Experts Share How to Achieve Business Value" (download page) gives insights into how large industries are exploiting big unstructured data to drive business value. The free ebook was created to demonstrate the benefits of text analytics to a vast array of companies, customer intelligence professionals, and marketers. In this ebook Dr. Petr Knoth discusses how text mining of scientific literature can help reveal meaningful connections which are hard to discover otherwise. From his experience as a Senior Data Scientist at Mendeley and founder of COnnecting REpositories (CORE), a database that aggregates open access scientific papers, he gives three recommendations for how to successfully apply and derive value from text analytics: 1. the need for an evaluation framework with well-defined metrics, 2. the necessity to collect representative ground truth data and 3. realistic and clear communication with the customer. Knoth states that "text mining has so many application domains, it is absolutely incredible".

Dasha Herrmannova and Petr Knoth...

Drahomira Herrmannova

Dasha and Petr have participated in the challenge which is part of the upcoming Web Search and Data Mining (WSDM) conference. The challenge, co-organised by Microsoft and Elsevier, was to assess the importance of scholarly articles using data from Microsoft Academic Graph -- a large heterogeneous graph comprising more than 120 million publications and the related authors, venues, organizations, and fields of study. Dasha and Petr (team called BletchleyPark) were the best out of 32 teams in the training round of the competition and, after the validation round, were invited to take part in the second phase of the challenge as one of the eight best of the 32 teams. They will also present their method at the workshop in San Francisco, California, in February. Related Links: WSDM Cup Challenge

International recognition of CORE in...

Nancy Pontika

KMi's project COnnecting REpositories (CORE) was included in the June issue of the Best of Business Web newsletter. According to the comment of the editor, Robert Berkman: "CORE ... is a real gold mine of a research site. You can perform precision searches by using the advanced search to quickly search via phrase or Boolean; limit by author, publisher and year; choose to only return articles available in fulltext; and search the entire text or limit to those found in the title and abstract. After the list of initial results are returned, you can further refine the list by publication type, language, journal and other fields. CORE also will suggest similar articles and displays these via a visually impressive interactive graph. While only 10% of the items are available in PDF fulltext, even for those that are not, full bibliographic information and an abstract are provided. Consider this site if you are looking for academic and scholarly papers from around the world, including those in languages other than English." The Best of Business Web is a monthly newsletter addressed to market researchers, information professionals, entrepreneurs and business librarians. The newsletter is run by the New School of Public Engagement, New York City, USA. This demonstrates the international attention that CORE receives and its important role in promoting open access to scholarly scientific results.

OU staff find out about Knowledge Media

Kate Dungate

The Open University's Charter Day celebrations concluded with the Learn About Fair yesterday. People from all over the OU found out more about a variety of KMi projects, including Engage and EDV. It was a particularly good opportunity for the latter to showcase the Democratic Replay tool, as we quickly approach the UK General Election next month. The two-day fair was well attended and we exhibited to several VIP guests. Zdenek was pleased to discuss OU Analyse with the Chancellor, Martha Lane Fox, who showed a real interest in the work behind it. Members of MK:Smart were also given the chance to meet Vice Chancellor Peter Horrocks and Mark Lancaster, MP for North East Milton Keynes. There was clear interest in our technologies from different faculties. We were approached about exhibiting our AR demo and CORE work at an OU conference in June. There was lots of interest in Paul Hogan's AR app promoting MK:Smart. Take a look at it in action in the photo gallery.

Doctor, doctor give me the news...

KMi Reporter

KMi is proud to announce that there are two new doctors in our midst. In a turn up for the books, and after years of hard work, Hassan Saif and Petr Knoth both passed their vivas today. We celebrated with a bottle of bubbly and speeches from each candidate. Both were very thankful for the support they had received from colleagues and friends at KMi. Petr highlighted that "KMi is a great environment to work in," and Hassan commented "When I first joined KMi I dreamed that this day would come."

The OU Welcomes our new Chancellor!

Rachel Coignac-Smith

Today marked an exciting day for the OU: Baroness Martha Lane Fox was installed as our new Chancellor in the Milton Keynes degree ceremony at Milton Keynes theatre. As part of the day's celebrations, KMi were invited to have a stand at the post-ceremony showcase at the Walton Hall campus in Milton Keynes. KMi presented our new pipeline technologies in three broad themes: the future of scholarly knowledge; a future of data; and the future of place. Our 'Future Place' theme is sufficiently wide-reaching to have had its own stand at this showcase, where it showed the new work of MK:Smart via a future vision of Milton Keynes. Similarly, on the KMi stand we presented some 'Place' themed technology, such as your individual library of texts via interactive eTextBooks; and your own laboratory via 'webcasting live and interactive'. The 'Future Scholar' theme showcased two projects: CORE - which is already 'plugged in' to the OU's Open Research Online service (ORO), and represents a vision of the future of open knowledge exchange, as it reads, integrates and shares the world's open texts; Rexplore - which maps the changing shape of scholarly disciplines via research people and publications. The 'Future Data' theme showcased three projects: OUAnalyse - which is creating a dashboard to help us predict and support student success via behavioural data; OUSocial - which reminds us that 'emotion' is a key part of any analysis, and indicates how we might mine social spaces for learning-related emotion; DiscOU - which presents a range of Apps that leverage Linked Data to discover and connect OU resources to the world. We were delighted to be part of this exceptional occasion for both graduates and the wider OU community!

OU Analyse: turning barriers into...

Petr Knoth

At the beginning of August, the OU Analytics team in KMi received the following letter from the Office of the Pro Vice-Chancellor (Academic): "Zdenek and team, At the office of PVC-A Team Away Day I asked colleagues to nominate people who they work with and who deserve a special thank you. The team were nominated for 'turning barriers into opportunities and ignoring boundaries to produce a weekly model for predictions whether students will progress'. Thank you from me and my team for all you do to support us". It is great to see the work of the Analytics team acknowledged for breaking barriers and facilitating the work of the OU Student Support Teams University-wide. Well done all! Related Links: The OU Analyse project

KMi Receives 12 Bottles of Quality...

Petr Knoth

Two mysterious boxes of quality champagne were delivered to KMi on Friday last week. After some initial uncertainty about the sender and the true recipient, it was revealed that they had been sent by the London-based company Research Research Limited. The champagne was addressed to Petr Knoth and his team developing the CORE system as an expression of thanks for producing the service. The company has started using the CORE dataset and services to improve the performance of their classification algorithms, which are applied in production as part of their business. The adoption of CORE helped to boost their performance dramatically, as indicated by a substantial increase in F-measure. The champagne was sent as special thanks for the CORE outputs and Petr's help in setting this use case up. The event raises an interesting question of whether the next REF should use champagne as one of the impact indicators.

KMi receives resources to provide CORE...

Petr Knoth

KMi is to receive resources from Jisc to support 3 full-time personnel to continue working on CORE and deliver it as a service. The Jisc decision to continue supporting CORE resulted from several events. First, the Open Mirror feasibility study, commissioned by Jisc last year and published in June 2014, recommended sustaining CORE. In parallel, Jisc asked KMi to create a Service Delivery Plan analysing the costs of sustaining CORE as a service. This Service Delivery Plan served as a basis for subsequent negotiations between the OU and Jisc in London in May 2014 about the implications of each service delivery option. The result of this meeting was an agreement to explore the possibility of delivering CORE as a joint OU and Jisc service starting from the second half of 2015. The meeting also created the basis for the specification of the work to be done from July 2014 to June 2015. This work is financially supported by Jisc and covers 3.2 full-time members of KMi staff. Related Links: Open Mirror Feasibility Study

Strong visibility of KMi's work at...

Petr Knoth

KMi's work received high visibility at the 9th International Conference on Open Repositories (OR2014) in Helsinki, Finland, especially because KMi's CORE project was mentioned on numerous occasions in the talks of non-OU conference participants. OR 2014 is the main conference in the field of open science and open access repositories and this year attracted over 400 attendees. Additionally, a large number of participants attended virtually, as sessions were also broadcast online. KMi's Petr Knoth, representing the CORE and FOSTER projects, delivered 1 full paper presentation and 2 posters and was also invited to sit on one panel. However, the highlight of the conference from KMi's perspective was certainly the fact that CORE was discussed in presentations of non-KMi people, sometimes even as a key enabling component. The first day of the conference hosted the Open Access Button workshop. OA Button makes individual moments of injustice and frustration in accessing research outputs visible to the world. The workshop, chaired by Penny Andrews of Sheffield University, discussed the technical issues the implementation of OA Button faced and highlighted the use of CORE as an important component for discovering open access copies of research articles on the Internet. The second day of the conference featured a presentation by Martin Klein of Los Alamos National Laboratory about the HyberActive project. HyberActive provides a pro-active service to archive web references from scholarly articles. A KMi visiting researcher, Dominika Koroncziova, and Petr Knoth helped the team in Los Alamos to integrate HyberActive with CORE and set up a demo for OR 2014. During the presentation, Martin Klein showed how CORE sends notifications about new articles and the references extracted from full-texts to their archiving service and described how this simplifies the management and preservation of research data, code and publications.
In the afternoon, Petr Knoth gave a full-paper presentation titled "My repository is being aggregated: a blessing or a curse?" authored by Petr in collaboration with Lucas Anastasiou and Samuel Pearce. Petr explained how repositories and aggregators need to create a mutually beneficial ecosystem in which usage statistics are shared, while preserving the distributed and open nature of the overall architecture. The next session was a panel organised by the Jisc Repositories Shared Services Project featuring presentations and discussion from a set of services and projects (RIOXX, V4OA projects, RJB, IRUS-UK, SHERPA Services and CORE), which are seen as critical for the UK research infrastructure. One representative from each of these projects was invited to sit on the panel featuring Jisc, the two Jisc centres of excellence EDINA and MIMAS, University of Nottingham and KMi, the Open University. This was a lively 75-minute session attracting a good number of questions from the audience. The long day continued with a poster on the new FOSTER project presented by Eloy Rodriguez and Petr Knoth and a poster on the Jisc RSSP project mentioning CORE and presented by Jisc with the assistance of the service providers. A few other posters, such as the London School of Economics poster, also mentioned CORE. The last day of the conference saw a presentation from Richard Jones of Cottage Labs demonstrating the outcomes of the Open Access Repository Registry project, funded by Jisc, in which KMi participates. The presentation showed the exchange of data between the new registry and CORE. Overall, this was a very busy but rewarding week.

Happy 45th Birthday, Open University

KMi Reporter

KMi celebrated The Open University's 45th 'Charter Day' today, marking the historic signing of The Open University Charter with the launch of the new OU Pipeline, a website dedicated to the delivery of Knowledge Media technologies into the OU. Showing off KMi pipeline technologies at Charter Day 2014, we presented a range of new work in three broad themes: the future of scholarly knowledge; a future of data; and the future of place. Our 'Future Place' theme is sufficiently wide-reaching to have had its own stand at Charter Day, where it showed the new work of MK:Smart via a future vision of Milton Keynes, the home of the main OU campus. Similarly, on the KMi stand we presented some 'Place' themed technology, such as your individual library of texts via interactive eTextBooks; and your own laboratory via 'webcasting live and interactive'. The 'Future Scholar' theme showcased two projects: CORE - which is already 'plugged in' to the OU's Open Research Online service (ORO), and represents a vision of the future of open knowledge exchange, as it reads, integrates and shares the world's open texts; Rexplore - which maps the changing shape of scholarly disciplines via research people and publications. The 'Future Data' theme showcased three projects: OUAnalyse - which is creating a dashboard to help us predict and support student success via behavioural data; OUSocial - which reminds us that 'emotion' is a key part of any analysis, and indicates how we might mine social spaces for learning-related emotion; DiscOU - which presents a range of Apps that leverage Linked Data to discover and connect OU resources to the world. All in all, some exciting KMi innovations to celebrate 45 years of Open University innovation! Related Links: The KMi Pipeline - OU sign in required, sorry!

KMi asked to feedback on the new HEFCE...

Petr Knoth

A new policy stating that research outputs submitted to the post-2014 REF should be Open Access was announced on Monday, March 31st, accompanied by a circular letter to all UK Vice-Chancellors and Principals (see the link below). The policy requires all journal and conference publications with a UK HEI author to be deposited in an institutional or subject repository on acceptance for publication. Publications not compliant with this requirement, including those that are made Open Access only retrospectively, will not be eligible for submission in the next REF exercise. The policy is fantastic news for all those who support unrestricted access to knowledge for all. It will enable millions of people who have been consistently denied access to research outputs, such as high school students, small and medium enterprises, government and the general public, to get access to the results of research funded by the taxpayer. The policy is the result of a rigorous consultation process. Input was also requested from KMi's Petr Knoth and Zdenek Zdrahal. Some of our previous recommendations, particularly those about unrestricted machine access, have been considered by HEFCE, as they have been interested in the possibility of using the CORE system, developed in KMi, for monitoring policy compliance. The last request for feedback was sent to Petr Knoth and Zdenek Zdrahal a week before the policy announcement, together with an early version of the policy. While the policy certainly constitutes a dramatic and positive change, there are still some aspects we hope will be tweaked at a later stage. Many of them relate to the ability to re-use and text-mine research publications at a global scale. The link to our full response is available at the bottom of this page. Related Links: KMi's response to the HEFCE policy The HEFCE Open Access policy document

The financial sustainability of CORE...

Petr Knoth

A study commissioned by the Knowledge Exchange (see the link below), a Danish Agency for Culture supported by a number of international funders, worked to identify Open Access services across the world that are key to the future of scholarly communication. The aim was to analyse the financial challenges these services face and create a shared strategy that would guarantee their sustainability. The services discussed in the study are used by millions of academics on a daily basis and are already an essential part of the research ecosystem. They include arXiv.org, EPrints, the Public Library of Science (PLoS), the Public Knowledge Project (PKP) and the Directory of Open Access Journals (DOAJ). The CORE system developed in KMi is among these services. When the study was published, the Knowledge Exchange funders organised a workshop in Utrecht, The Netherlands, inviting a representative of each of the services. KMi's Petr Knoth attended the meeting representing CORE. The meeting showed that the key services forming the Open Access ecosystem (Neil Jacobs of Jisc presented an overview of them - see photo) are often in very different situations. Some of them, such as the BASE system, receive a significant financial contribution from their own institution in the spirit of supporting Open Access, while other institutions see these services rather as an opportunity to generate profit to improve their own budget. These institutions do not consider their share of responsibility for the future of scholarly communication or they do not embrace the ideas of openness of research outputs and education for all. Consequently, some of these services, such as DOAJ, have already left the academic environment as they can be more efficiently provided through a not-for-profit company.
While a whole range of issues was discussed, the most critical included:
- The continuous need for funding: It is very difficult or even impossible for many of these services to charge the end-user a fee as this goes directly against their mission.
- Institutional greed: Universities are often not willing to lower the overheads for these services. Libraries are typically not willing to contribute even a small percentage (in the range of 1%) of their commercial services and articles subscription budgets (Elsevier, EBSCO, SciVal, etc) to Open Access services.
- Supporting a global service at the local level: Universities and libraries are typically not willing to financially contribute to a service which benefits the whole world, not just them.
One of the outcomes of the meeting was that a package of these critical Open Access services should be created and certain funders that distribute money to universities and libraries, such as HEFCE, should mandate a financial contribution to the sustainability of Open Access services. This strategy is now being explored by the Knowledge Exchange. Related Links: Sustainability of Open Access Services Report Phase 1 and 2: Scoping the challenge and consulting the stakeholders

CORE exhibits at the Learn About Fair

Kate Dungate

CORE participated in the annual Learn About Fair at the OU, which took place on 26 February 2014. The CORE team had the chance to present the CORE vision to a wide range of visitors including PhD students, associate lecturers, developers and other academic staff. During the fair the team had the opportunity to demonstrate CORE applications, including the CORE portal search and the mobile application, to showcase the CORE plugin and to explain other aspects of the service, such as the CORE API and the repository analytics. Several tutors expressed their interest in using CORE as a research search engine, while a few developers explored the opportunities to make use of the CORE API on top of their applications.

CORE becomes the official UK Open...

Petr Knoth

In June 2013, Jisc invited selected teams to bid in a closed tender for the UK National Open Access aggregator. The wide set of requirements included high coverage of UK institutional repositories, the ability to harvest and process data from different repository systems and the availability of a single harmonized API to data stored across UK repositories. The key factors for judging the proposed bids were availability and maturity of existing solutions, satisfying the technical criteria of the tender and the timescale and cost required to meet additional tender criteria. The CORE team, based at KMi, elaborated the proposal based on the existing CORE solutions. The key components had already been implemented and were available from the CORE system. Most of the required services had in principle been included in the currently running DiggiCORE project. Their adaptation to meet the tender specification was relatively straightforward. The KMi bid for the UK Open Access Aggregator was submitted before the 2nd July deadline. In July, Jisc announced that the KMi solution had won the tender and that CORE would be the UK national Open Access aggregator. KMi was then asked to prepare and negotiate a concrete project plan with Jisc that, on top of the tender, asks CORE to network with a number of key stakeholders, in particular Google Scholar and OpenAIRE. On December 12th, Jisc issued the Grant Letter for UK Aggregation, though the project had already started in October. UK Aggregation is part of the Jisc Repositories Shared Services project. As the UK Open Access aggregator of institutional repositories, CORE provides new opportunities for services built on top of the aggregated content. Apart from supporting text-mining, developers and discoverability of content, CORE also offers opportunities for analysis and monitoring.
For example, HEFCE announced that research papers must be available through an institutional repository immediately after their publication in order to be eligible for post-2014 REF submission. If there is an agreed embargo period for open access availability, the rule takes it into account. CORE as the UK Open Access aggregator can provide all the information needed to confirm the compliance of publications with REF rules. The CORE team has developed a pilot application that allows the user to monitor REF compliance. Petr Knoth and Zdenek Zdrahal were invited to the Workshop on repositories held in the HEFCE office in London on November 22nd, which was chaired by Dr Steven Hill, Head of Policy at HEFCE. Petr and Zdenek presented the CORE Compliance Analytics application to the other participants of the workshop. The application can support a range of stakeholders, including researchers, repository managers and HEFCE, in monitoring compliance with the HEFCE Open Access post-2014 REF policy. At present, CORE content consists of 18M+ records with 1.8M+ full text, machine readable documents from 612 institutional repositories worldwide. This includes all compliant UK institutional repositories. CORE services are used by OU's ORO and a number of other institutions, including the European Library and UNESCO. In November 2013, CORE content was accessed by 150k+ unique visitors. Related Links: CORE

Prof Beth Plale on the work at Hathi...

Petr Knoth

On December 12th, Prof. Beth Plale of Indiana University, co-director and chair of the HathiTrust Research Center (HTRC), visited KMi. HTRC is a collaborative research center launched jointly by Indiana University and the University of Illinois, along with the HathiTrust Digital Library. It helps researchers meet the technical challenges of dealing with massive amounts of digital text by developing cutting-edge software tools and cyberinfrastructure that enable advanced computational access to the growing digital record of human knowledge. During her short visit to the OU, Prof. Plale gave a presentation to KMi staff and guests from the OU Library and other OU departments about the organization and the current research activities of HTRC. Before and after her presentation, she discussed the challenges of document aggregation and text mining with Petr Knoth and Zdenek Zdrahal of KMi. Prof. Plale leads the US team in the joint proposal DiscoveryCORE submitted to the third Digging into Data Call. The partners in the DiscoveryCORE bid are Open University - KMi (UK), HTRC, University of Indiana and University of Illinois (US) and the European Library (NL). Related Links: Hathi Trust Research Center

CORE hosted the national UKCoRR meeting

Petr Knoth

On December 3rd 2013, the members' meeting of The United Kingdom Council of Research Repositories (UKCoRR), which is recognised as the main network of professionals supporting the uptake of Open Access in the UK, was hosted by the Knowledge Media Institute. The meeting was attended by about 50 delegates from a wide range of organisations including universities, libraries, not-for-profits (EuroCRIS) and funders (Jisc, HEFCE, EPSRC). Many others followed the event virtually, thanks to the meeting being streamed online. The meeting was opened by Prof Peter Scott, who explained that the mission of the Open University to deliver more open education and research is well-aligned with the goal of UKCoRR. Peter also discussed the role of KMi within the OU and a number of achievements of KMi in the area of Open Access to educational resources. The rest of the meeting was moderated by the UKCoRR Chair, Yvonne Budden of the University of Warwick. In the morning session Yvonne Budden informed the participants about important UKCoRR activities. In the following presentation Ben Johnson of HEFCE explained the position of HEFCE on Open Access, which currently seems like a game changer. HEFCE requires all research outputs submitted to the post-2014 REF to be made Open Access in order to be eligible. Five "Lightning talks" then introduced important challenges of Open Access publishing. In the afternoon session Petr Knoth of KMi presented the current state of development of the CORE system, which aggregates OA content from repositories. Zdenek Zdrahal (KMi) then showed how CORE could be used for monitoring OA compliance for the post-2014 REF and Lucas Anastasiou (KMi) discussed the issues in OAI-PMH harvesting and how we can overcome them. Chris Biggs (OU Library) demonstrated innovative repository benchmarking in Open Research Online (ORO).
In the last presentation, Nicky Whitsed, Director of Library Services at the OU, summarised the important issues of open access publishing and institutional repositories. Related Links: The United Kingdom Council of Research Repositories UKCoRR CORE

CORE among the top 10 search engines...

Petr Knoth

Using search engines effectively is now a key skill for researchers, but could more be done to equip young researchers with the tools they need? Here, Dr Neil Jacobs and Rachel Bruce from JISC's digital infrastructure team shared their top ten resources for researchers from across the web. CORE was placed among the top 10 search engines that go beyond Google. Related Links: The top ten search engines for researchers that go beyond Google

The visit of HRH The Duke of York to...

KMi Reporter

Today, Thursday 23rd May, His Royal Highness Andrew, The Duke of York visited the Open University. His Royal Highness was met by Martin Bean, the Open University Vice-Chancellor, and Sir Henry Aubrey-Fletcher, Her Majesty's Lord-Lieutenant for Buckinghamshire. The Duke met with a range of eminent guests and friends of the Open University and unveiled a plaque in the JLB Nexus area to commemorate his visit. During the tour of Open University innovation highlights, the Duke met with KMi Director, Professor Peter Scott, and KMi research student Drahomira Herrmannova. Peter introduced the Duke to our interactive book research and work in iTunes U and provided a perspective on 'post personal computing', and Dasha discussed 'Big Data' and learning analytics research. Other University highlights included a discussion of our new FutureLearn venture, our new Open Educational Resource work in OpenLearn, and the new LTS App 'OU Anywhere'. Related Links: The Duke of York KMI Interactive Books Project

CORE selected one of the Top 100...

Petr Knoth

CORE has been placed among the Top 100 Thesis & Dissertation References on the Web by OnlinePhDProgram.org. The list was published yesterday. OnlinePhDProgram.org is dedicated to helping future doctoral candidates find the right program that meets their needs, desires, and goals. The site offers helpful blog posts, articles, and a wealth of other information that can answer questions about online Ph.D. programs. Related Links: The Top 100 Thesis Dissertation References on the Web list

Europeana Cloud kicks off under clear...

Petr Knoth

A cloudless sky in The Hague, Netherlands, greeted the Europeana Cloud kick-off on the 4th and 5th of March. The event was attended by about 70 delegates from the partner institutions and also by the head of the responsible European Commission unit. One of the important tasks of the kick-off was to further discuss the infrastructure requirements that will be used to select and shape the type of Cloud to be developed. This initial meeting on 4-5 March marked the official start of three years of collaboration between 35 partners. It is a diverse group, including representatives of libraries, research infrastructures, developers, publishers and researchers. They come from many different backgrounds but nevertheless share a common goal of establishing a cloud-based system for Europeana and its aggregators. Europeana Cloud is a €4 million Best Practice Network coordinated by the Europeana Foundation, designed to establish a cloud-based system for Europeana and its aggregators. Europeana Cloud will deliver new content, new metadata, a new distributed storage system, new tools and services for researchers and a new platform - Europeana Research. Content providers and aggregators, across the European information landscape, urgently need a cheaper, more sustainable technical infrastructure that is capable of storing both metadata and content. Researchers require a digital space where they can undertake innovative exploration and analysis of Europe's digitised content. Europeana needs to get closer to the target of 30 million items by 2015. KMi is the partner with the second highest number of person-months (after the Europeana Foundation) out of 33 partners. KMi was invited to the project based on our experience in content aggregation and text-mining acquired in the CORE family of projects.
Apart from developing the Cloud specification, reviewing existing Cloud technologies and assessing their suitability for Europeana, KMi will also be responsible for experimenting with different models for identifying semantically related content from a database of around 30 million objects. This technology will then be provided as a service of the Cloud. Related Links: Europeana Cloud Kicks Off Under Clear Skies

KMi achieves excellent results in...

Petr Knoth

The KMi team consisting of Petr Knoth, Drahomira Herrmannova and Zdenek Zdrahal has, according to the organisers, achieved the overall best results in the English to Chinese, Japanese and Korean (English to CJK) task of the NTCIR-10 CrossLink evaluation competition and is a top performer (steadily among the three best and mostly second best) in the CJK to English task. Ten international teams took part in the evaluation. This is the second time team KMi has participated in this competition. NTCIR is a major forum (similar to TREC) of evaluation workshops designed to enhance research in Information Access (IA) technologies including information retrieval, question answering, text summarization, extraction, etc. The NTCIR-10 conference will take place as usual in Tokyo, Japan this June. The CrossLink task (Cross-Lingual Link Discovery - CLLD) is a way of automatically finding potential links between documents in different languages. It differs from traditional cross-lingual information retrieval (CLIR): CLIR can be viewed as a process of creating a virtual link between the provided cross-lingual query and the retrieved documents, whereas CLLD actively recommends a set of meaningful anchors in the source document and uses them as queries with the contextual information from the text to establish links with documents in other languages. Wikipedia is an online multilingual encyclopaedia that contains a very large number of articles covering most written languages, and it includes extensive hypertext links between documents in the same language for easy reading and referencing. However, pages in different languages are rarely linked except for the cross-lingual link between pages about the same subject. This could pose serious difficulties to users who try to seek information or knowledge from sources in different languages. Therefore, cross-lingual link discovery tries to break the language barrier in knowledge sharing.
With CLLD, users are able to discover documents in languages which they are familiar with, or which have a richer set of documents than their language of choice. Related Links: NTCIR-10
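The anchor-recommendation idea behind CLLD described above can be illustrated with a very small sketch. This is not the method used by the KMi team; it is a simple dictionary-lookup baseline under stated assumptions. The function name `recommend_links`, the `title_index` mapping (source-language article titles mapped to target-language page identifiers, as could be derived from existing cross-lingual links between pages on the same subject) and the page identifiers are all hypothetical, and the step of using anchors plus context as queries is omitted.

```python
import re
from typing import Dict, List, Tuple


def recommend_links(text: str,
                    title_index: Dict[str, str],
                    max_len: int = 3) -> List[Tuple[str, str]]:
    """Return (anchor phrase, target-language page id) pairs.

    title_index is a hypothetical lookup from lower-cased source-language
    titles to identifiers of pages in the target language.
    """
    words = re.findall(r"\w+", text.lower())
    links: List[Tuple[str, str]] = []
    seen = set()
    i = 0
    while i < len(words):
        matched = False
        # Prefer the longest phrase starting at position i.
        for n in range(min(max_len, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in title_index and phrase not in seen:
                links.append((phrase, title_index[phrase]))
                seen.add(phrase)  # link each anchor only once
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return links


# Hypothetical index with made-up target page identifiers.
index = {
    "open access": "zh:page_open_access",
    "information retrieval": "zh:page_information_retrieval",
}
sample = "Open access improves information retrieval. Open access matters."
result = recommend_links(sample, index)
```

A real CLLD system would additionally rank candidate anchors and use the surrounding context to choose among possible target documents; the sketch only shows the anchor-detection and lookup step.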

CORE at the World Summit on the...

Petr Knoth

The first multi-stakeholder review of the achievements of the World Summit on the Information Society (WSIS+10) entitled "Towards Knowledge Societies for Peace and Sustainable Development" was hosted by UNESCO in Paris from 25 to 27 February 2013. The event attracted about 1000 participants from around the world; about 1500 additional people participated on-line. Zdenek Zdrahal was invited to participate in the high-level roundtable entitled "24. Using E-Science to Strengthen the Interface between Science, Policy and Society". The panel was chaired by the Assistant Director-General for Natural Sciences, Ms Gretchen Kalonji, and included seven members (governmental ministers, ambassadors, the Head of the Digital Science Unit of the European Commission and the former Chief Scientific Advisor for a UK Government department). The aim of the roundtable was to explore the opportunities and challenges of using e-Science to support decision making in science policy, to look at the technical requirements for designing a web-based platform to support decision making in science policy and to share experiences gained from developing similar platforms. Zdenek Zdrahal presented the CORE system for aggregating, semantically enriching and accessing open access scientific papers. The potential of CORE for supporting novel approaches to E-Science was explained. As an example, the conference portal "UNESCO Repository for Connecting Local and International Content (CLIC)" developed for UNESCO by the CORE team at KMi was presented. Documents submitted to the UNESCO conferences through CLIC are semantically enriched and linked to the most similar scientific papers aggregated by CORE from the world's open access repositories. Since one of the WSIS+10 hot topics discussed by many participants from governments, industry and academia was "broadband learning", the possibility of using CORE services for projects like Futurelearn was also outlined.
The presentation was followed by a number of informal meetings with the WSIS+10 participants where the future directions of CORE development and the possibilities of using CORE services were discussed. Related Links: schedule of events, agenda, participants CLIC portal WSIS10 presentation

KMi to Play a Key Role in Shaping the...

Petr Knoth

The eCloud (Europeana Cloud: Unlocking Europe's Research via The Cloud) project is about to start on 1 February 2013. Europeana Cloud is a €4 million project coordinated by the Europeana Foundation, designed to establish a cloud-based system for Europeana and its aggregators. Europeana Cloud will deliver new content, new metadata, a new distributed storage system, new tools and services for researchers and a new platform, Europeana Research. Content providers and aggregators across the European information landscape urgently need a cheaper, more sustainable technical infrastructure that is capable of storing both metadata and content. Researchers require a digital space where they can undertake innovative exploration and analysis of Europe's digitised content. Europeana needs to get closer to the target of 30 million items by 2015. KMi is the partner with the second-highest number of person-months (after the Europeana Foundation) out of 33 partners. KMi was invited to the project based on our experience in content aggregation and text mining acquired in the CORE family of projects. Apart from building the eCloud infrastructure, KMi will also be responsible for experimenting with different models for identifying semantically related content in a database of around 30 million objects. This technology will then be provided as a cloud service. Related Links: Europeana Cloud CORE

KMi to build the official UK Open...

Petr Knoth

KMi, together with the University of Nottingham (CRC) and CottageLabs, has been awarded a grant in the JISC Digital Infrastructure Programme to build a UK Open Access Repository Registry. KMi was invited to participate in this closed call directly by JISC based on our work in the CORE project. It has already been decided that the resulting software will become an essential component of UK RepositoryNet+, which will guarantee its long-term sustainability. RepNetRegistry will build on our experience supporting OpenDOAR to provide an advanced, data-driven infrastructure that maximises the potential for use with third-party services, such as aggregators, cross-search tools or multiple-deposit interfaces, by exposing authoritative, quality-controlled data through a RESTful API. In the context of this project, the CORE team will be responsible for collecting repository statistics from across all UK repositories and providing them as authoritative repository benchmarks to the Open Access Repository Registry.

OUs full text search system makes huge...

KMi Reporter

The Open University has widened access to academic research material, available through its Open Access search facility CORE, thanks to technical leaps in this innovative system created by the OU's Knowledge Media Institute (KMi). CORE, which stands for COnnecting REpositories, has seen unprecedented success in the past year and has more than tripled in size, now offering content from a global network of repositories, freely available to scholars worldwide. CORE provides a large, easy-to-search database to help academics, researchers and students to find, explore and download research papers. When the service was first launched in 2011, CORE could source material from 60 repositories; today it aggregates data from over 230 repositories internationally, plus content from thousands of Open Access journals acquired through the Directory of Open Access Journals (DOAJ). This means the service holds more than nine million metadata items and about half a million full-text files. Funding from JISC is permitting the project to develop further analytical processes through the DiggiCORE project, which will utilise social media tools. Unlike other Open Access scholarly search systems, CORE aggregates the full-text files, and not only metadata, and therefore ensures that the full texts of publications are freely available for download. Users of commercial academic search systems, such as Google Scholar, can be denied access to the full article, particularly when subscription fees are required. This is often frustrating for scholars. CORE specialises in searches of the full-text items held in approved Open Access repositories, ensuring a vastly improved level of accessibility for users. Anyone searching for full texts on CORE will therefore be able to download all content they discover.
CORE offers a unique application programming interface (API) that makes it possible for others to easily build applications utilising the Open Access content. The CORE API has a lot of potential. "For example, it allowed us to build an application that enables people to search for Open Access content from mobile devices or to develop a content recommendation plug-in for libraries," says Petr Knoth, the software designer and founder of the CORE system. The reason for CORE's success is clear, says Petr: "A huge amount of research papers has been available online as Open Access, but there was limited technical infrastructure that would support different kinds of users in exploiting it. CORE is not only a search system, it is a free platform for developing applications that need access to the full-text of research articles." A very large amount of data is now available through the CORE API. The CORE Linked Open Data repository has this month already grown to 100 million RDF triples, making it by far the largest Linked Open Data repository at the Open University. "CORE has created a resource which offers some intriguing possibilities. The API to the aggregation puts this valuable information into the hands of researchers and developers and offers them the chance to use it in new and better ways," says Andy McGregor, the JISC manager of the Resource Discovery programme. "The strength of CORE is that it can be applied in multiple scenarios. In addition to searching for scientific publications, we expect the CORE infrastructure to be used for analytical and research purposes," says Zdenek Zdrahal, the director of the CORE project. "The CORE platform has become a basis for the development of new services and motivates further research," says Zdenek. In the currently running JISC-funded DiggiCORE project, a collaboration between the Open University and the European Library, the CORE system is used as a platform for analysing networks of research publications to help better understand the properties of high-impact publications and influential authors. Components of the CORE system are also likely to find use in future projects.
CORE is now available for flexible use online and on mobile devices and tablets, and is already being used by journals and scholars, at conferences, and as technical support answering the demand for Open Access to academic research papers. Related Links: The CORE project page

iPad University launch in UAE

KMi Reporter

How do you mark the start of the largest experiment to test a nation-wide mobilization of mobile learning in higher education anywhere in the world? On September 25th 2012 His Excellency Sheikh Nahayan Mabarak Al Nahayan, UAE Minister for Higher Education & Scientific Research, inaugurated the First Annual Global Mobile Learning Congress 2012 at the United Arab Emirates University in Al Ain. This congress marked the achievement of the Federal Mobile Learning Initiative (initiated only in April 2012), which has set the three Federal higher education institutions of the UAE to introduce iPad-based teaching and learning for entering Foundation Program students this September, starting an exciting programme to explore post-personal-computer learning at scale. Something like 14,000 students will use Apple's new iPad to "learn different". KMi Director Peter Scott presented the Open University's perspective on learning in a post-personal-computer world at the congress, as part of a series of international guest presentations that both celebrated the launch and helped the UAE team to think carefully about how they will track and evaluate its impact as the Foundation students progress in their work. Other Congress speakers included Dr. Ruben Puentedura, the Founder and President of US-based consulting firm Hippasus, which focuses on transformative applications of information technologies to education; Dr. James Ashby, President and Chief of Psychometrics, CORE Edutech, USA, who is a leading innovator in research-based education designs for elementary, secondary, and higher education; and Apple Distinguished Educator David Baugh recounting the great success of the School in a Box initiative which allows remote schools, villages and towns in places such as Nepal. Federal Mobile Learning Initiative Chairman Dr. Tayeb Kamali noted that this event was just the start of an important and ongoing collaborative project from the three Federal higher education institutions to help boost students' learning outcomes. It is a very exciting time to be a learner in the UAE! And yet more exciting to be a teacher in this new world. Related Links: The HCT News story on the iPad Launch

Drahomira Herrmannova was awarded the...

Petr Knoth

Drahomira Herrmannova was awarded the Prize of Zdena Rabova by the Brno University of Technology. This prize is awarded annually by the dean of the faculty to two students for excellent study and research results. The nomination was supported by Drahomira's diploma thesis, which she wrote during her time at KMi and which was based on a paper by Drahomira and Petr Knoth presented at the JCDL 2012 conference.

CORE Fight for Open Access in Scotland!

Petr Knoth

The 7th International Conference on Open Repositories (OR 2012) attracted close to 500 participants last week, the highest number in its history. The theme and title of OR 2012 in Edinburgh, "Open Services for Open Content: Local In for Global Out", reflects the current move towards open content, 'augmented content', distributed systems and data delivery infrastructures. This is a very good fit with what CORE (core.kmi.open.ac.uk) offers. The CORE system developed in KMi had a very active presence. Petr Knoth presented different aspects of the CORE system in a presentation, at a poster session (with Owen Stephens) and also during the developers challenge. CORE was also discussed in a number of presentations by other participants not directly linked to the Open University, perhaps the most important being the UK RepositoryNet+ project presentation. UK RepositoryNet+ is a socio-technical infrastructure funded by JISC supporting deposit, curation & exposure of Open Access research literature. UK RepositoryNet+ aims to provide a stable socio-technical infrastructure at the network level to maximize value to UK HE of that investment by supporting a mix of distributed and centrally delivered service components within pro-active management, operation, support and outcome. While this infrastructure will be designed to meet the needs of UK research, it is set in, and must operate effectively within, a global context. UK RepositoryNet+ considers the CORE system an important component of this infrastructure. What the CORE approach shares with William Wallace, the Scottish hero in the picture, is the determination to fight for freedom; in this case, freedom of access to content. There is, hopefully, also one difference. We hope CORE will not end up in the same way as William Wallace ... We will see :-) Related Links: OR2012 William Wallace

Yes, we can! - The CORE team organises...

Petr Knoth

KMi and the European Library/Europeana jointly organised the 1st International Workshop on Mining Scientific Publications, associated with JCDL 2012, the most prestigious conference in the world of digital libraries. The workshop was attended by major players in the field including the National Library of Medicine, the Library of Congress, CiteSeerX, Elsevier and the British Library. Although Barack in the end didn't come, the workshop was very successful, the only problem being the lack of chairs in the room. We (the workshop organisers: Petr Knoth, KMi; Zdenek Zdrahal, KMi; and Andreas Juffinger, The European Library/Europeana) were motivated by the community's positive response to the importance of the issues researchers face when mining research publications to improve the way research is carried out and evaluated. A paper authored by Drahomira (aka Dasha) Herrmannova and Petr Knoth (both KMi), entitled 'Visual search for supporting content exploration in large document collections' and presented by Dasha during the workshop, received encouraging feedback. Another KMi talk was given by Petr, who discussed the issues in current digital library aggregation systems, especially those focusing on Open Access, and explained the advantages offered by the CORE system in a presentation titled "COnnecting REpositories (CORE): Aggregating and Enriching Content to Support Open Access." All papers presented at the workshop are available on the workshop web page below. Related Links: 1st International Workshop on Mining Scientific Publications

Cor! It's time for CORE!

Petr Knoth

"Cor! It's time for CORE!" is an article published by the University of London Computing Centre featuring certain aspects of the CORE system. Check it out ... Related Links: Cor! It's time for CORE!

New project: UK-wide, educational app...

Fridolin Wild

KMi was contracted by JISC to create EDUKApp, an educational, UK-wide app and widget store. EDUKApp will be both a repository and a community site that focuses on collecting and promoting widgets and apps for learning and teaching. The first prototype of EDUKApp was presented at the JISC CETIS conference, which took place in Nottingham on Feb 22-23 under the motto: "The Future Just Happened? Technology Innovation in Universities and Colleges". KMi researchers Fridolin Wild, Lucas Anastasiou, and Alexander Mikroyannidis used the half-day workshop to introduce attending academics, governmental-level education officials, researchers, and software evangelists to personal learning environments, widgets, and the new store, with positive reviews and feedback.

Another member of CORE family

Petr Knoth

DiggiCORE is a new two-year project funded under the Digging into Data programme, which supports collaboration between the UK, USA, Canada and the Netherlands. The DiggiCORE partnership consists of KMi and The European Library. This makes DiggiCORE the only fully European funded project in the whole programme. The members of the DiggiCORE Advisory Board represent The Open University, SPARC Europe and the Europeana Foundation. The objective of DiggiCORE is to analyse a vast set of research publications from the Open Access domain using natural language processing and social network analysis methods to identify patterns in the behaviour of research communities, to recognise trends in research disciplines, to gain new insights into the citation behaviour of researchers and to discover features that distinguish papers with high impact. The results of this analysis should enable the development of better methods for exploratory search and browsing in digital collections and should encourage new ways of evaluating research or a researcher's impact beyond standard citation measures. To enable the analysis, the DiggiCORE project will extend and improve the CORE system, providing access to well-structured and organised information acquired by harvesting, cleaning, integrating and processing information from a very large and fast-growing collection of research publications distributed across more than 1,800 Open Access repositories and many Open Access journals. The Open University's Open Research Online (ORO) is among the harvested institutional repositories. The DiggiCORE infrastructure will be freely accessible to the public through a set of web services. Related Links: Eight international research funders announce winners of 2011 Digging into Data challenge DiggiCORE project plan Winners of the Digging into Data programme

ServiceCORE project has started

Petr Knoth

ServiceCORE is a follow-up project to CORE, funded by JISC. The project aims to develop a nation-wide service for searching, navigating and accessing Open Access publications stored across 143 British institutional repositories. The CORE system is unique in the way it uses text mining and linked data to connect and interlink semantically similar publications at the level of full texts. Within ServiceCORE this functionality will also be extended to metadata records. The fact that KMi has been funded to extend the current CORE system with new functionality and to establish it as a British service is a great challenge as well as an acknowledgment of CORE's success and wide impact. Please read this news story to see what JISC says about CORE: http://www.jisc.ac.uk/news/stories/2011/09/openaccess.aspx. The ServiceCORE project benefits from a very strong Advisory Board with members from OpenDOAR, UKOLN, MIMAS and The European Library. Related Links: CORE on JISC website CORE portal CORE video

KMi wins the Best Poster/Demo Award at...

Petr Knoth

The KMi submission authored by Petr Knoth, Vojtech Robotka and Zdenek Zdrahal, entitled "Connecting Repositories in the Open Access Domain using Text Mining and Semantic Data", won the Best Poster/Demo Award at the International Conference on Theory and Practice of Digital Libraries (TPDL 2011), which is taking place this week in Berlin, Germany. The European Conference on Research and Advanced Technology for Digital Libraries (ECDL) has been the leading European scientific forum on digital libraries for 14 years. For its 15th year the conference was renamed the International Conference on Theory and Practice of Digital Libraries (TPDL). Related Links: CORE

KMI @ NTCIR CrossLink competition

Petr Knoth

The KMi team consisting of Petr Knoth, Lukas Zilka and Zdenek Zdrahal scored first in the NTCIR CrossLink competition in the manual assessment category in A2F P@5. The team placed consistently in the top three in other categories. Twelve international teams took part in the evaluation. NTCIR is a major forum (similar to TREC) of evaluation workshops designed to enhance research in Information Access (IA) technologies including information retrieval, question answering, text summarization, extraction, etc. NTCIR-9 will take place, as usual, in Tokyo, Japan this December. The CrossLink task (Cross-Lingual Link Discovery, CLLD) is a way of automatically finding potential links between documents in different languages. It differs from traditional cross-lingual information retrieval (CLIR): CLIR can be viewed as a process of creating a virtual link between the provided cross-lingual query and the retrieved documents, whereas CLLD actively recommends a set of meaningful anchors in the source document and uses them as queries, together with contextual information from the text, to establish links with documents in other languages. Wikipedia is an online multilingual encyclopaedia that contains a very large number of articles covering most written languages, and it includes extensive hypertext links between documents of the same language for easy reading and referencing. However, pages in different languages are rarely linked, except for the cross-lingual link between pages about the same subject. This could pose serious difficulties to users who try to seek information or knowledge from sources in different languages. Cross-lingual link discovery therefore tries to break the language barrier in knowledge sharing. With CLLD, users are able to discover documents in languages which they are either familiar with, or which have a richer set of documents than their language of choice.
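The "P@5" in the result above presumably denotes precision at rank 5, a standard ranking measure: the share of the first five recommended links that assessors judged relevant. The following is a minimal sketch of that measure only; the function name and the sample identifiers are hypothetical, and the official NTCIR evaluation procedure may differ in detail.

```python
def precision_at_k(ranked_links, relevant_links, k=5):
    """Fraction of the top-k recommended links that were judged relevant."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    top_k = ranked_links[:k]
    hits = sum(1 for link in top_k if link in relevant_links)
    # Divide by k (not by len(top_k)) so short result lists are penalised.
    return hits / k


# Hypothetical ranked output of a link discovery system and assessor judgements.
recommended = ["doc_a", "doc_b", "doc_c", "doc_d", "doc_e", "doc_f"]
judged_relevant = {"doc_a", "doc_c", "doc_x"}
score = precision_at_k(recommended, judged_relevant, k=5)
```

Here two of the first five recommendations are relevant, so the score is 2/5.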

First KMi Summit on Linked Data and...

John Domingue

Linked Data, the sharing of data on the Web at large scale through the use of URIs, the HTTP protocol and RDF, is gaining uptake at an ever-increasing rate. As is commonly known, this technology is now supported by a number of large players including the UK Government, the BBC, Google, Yahoo and to some extent Facebook. In addition to the relative simplicity of the principles and technologies underlying Linked Data, a second major reason for its growing popularity is that it is supported by a number of high-quality, industrial-strength tools. As a result, we now find ourselves in a situation where a number of KMi-related projects are deploying Linked Data here at the OU. This first summit provided an opportunity for these projects to outline what has been achieved thus far and for KMi in general to discuss plans and priorities for future development and deployment. Specifically, the event covered the following projects:

  • LUCERO, the project which set up data.open.ac.uk, the first Linked Data site in the UK (and probably the world) for a higher education establishment.
  • RADAR, which supports the analysis and management of OU research data, thus aiding the OU REF submission.
  • CORE, which connects disparate scientific repositories, enabling them to be searched as a single resource.
  • Annomation and SugarTube, which enable, respectively, the annotation of and semantic search over BBC archives. These tools aim to support OU course teams who wish to find relevant video segments related to a specific OU course topic.
  • UCIAD, which uses Linked Data to support the analysis of user activity across OU systems.

Overall the event was very successful, showcasing a number of innovations and also how useful Linked Data already is to the OU business.
One of the main action points of the meeting is that KMi will support the creation of a new OU-wide Linked Data portal which will: act as a central repository for all relevant resources; document relevant activities; and act as an OU Linked Data showcase. Related Links: The LUCERO presentation The LUCERO website The RADAR presentation The RADAR website internal OU access only The CORE presentation The CORE website The UCIAD presentation The UCIAD website
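To make the Linked Data idea concrete, the sketch below represents a few statements as (subject, predicate, object) triples and prints them in the N-Triples text format. It is an illustration only: the example.org resource URIs are hypothetical and are not taken from data.open.ac.uk or any KMi project, and the Dublin Core `title` and `relation` terms are used as familiar predicates.

```python
class URI(str):
    """Marks a string as a URI reference rather than a plain text literal."""


def format_term(term):
    """Render one term of a triple in N-Triples syntax."""
    if isinstance(term, URI):
        return f"<{term}>"
    # Plain strings become quoted literals; escape backslashes and quotes.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialise (subject, predicate, object) triples, one statement per line."""
    return "\n".join(
        f"{format_term(s)} {format_term(p)} {format_term(o)} ." for s, p, o in triples
    )


# Hypothetical resources: a paper, its title, and a link to a related paper.
paper = URI("http://example.org/paper/1")
related = URI("http://example.org/paper/2")
triples = [
    (paper, URI("http://purl.org/dc/terms/title"), "Connecting Repositories"),
    (paper, URI("http://purl.org/dc/terms/relation"), related),
]
serialised = to_ntriples(triples)
print(serialised)
```

Because every subject and predicate is a URI, statements published by different projects can refer to the same resource and be combined without prior coordination.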

KMi in The Times

Petr Knoth

On Wednesday 23rd March 2011, the Eurogene project led by Dr Zdenek Zdrahal was featured in the printed version of The Times in the article entitled "Gene genie's treasure trove" by Mark Frary. The article discusses, in an interview with Zdenek and Petr, the results and the mission of Eurogene: to provide free multimedia learning resources in ten languages for statistical, medical and molecular genetics and to deliver them to students and professionals using KMi technology.

The CORE project started!

Petr Knoth

The COnnecting REpositories (CORE) project officially started today with a kick-off meeting attended by representatives from JISC, OpenDOAR, UKOLN, the OU Library and KMi. The CORE project aims to facilitate access to and navigation across relevant scientific papers stored in Open Access repositories. The project will create a new open metadata repository, available in the Linked Data format, describing the semantic relatedness between research articles stored across a selection of UK repositories, including the Open University's Open Research Online (ORO). This will be achieved by harvesting and processing full-text content using NLP techniques for automatic link discovery. The CORE project will also develop a web service and a demonstrator client which will allow UK repositories to easily direct their users to relevant full-text Open Access content stored elsewhere. The usability of this service will be demonstrated on the ORO repository by automatically recommending links to related content in other repositories. CORE will also focus on developing good practice for reuse and uptake of the service, in collaboration with UKOLN and OpenDOAR. Related Links: CORE project website
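The announcement does not say which NLP techniques CORE uses for link discovery. As one common baseline for recommending related documents, and not a description of the CORE implementation, the sketch below compares word-count vectors with cosine similarity. All function names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lower-case the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_text, documents, top_n=3):
    """Return up to top_n (doc_id, score) pairs, most similar first."""
    query = term_vector(query_text)
    scored = [
        (doc_id, cosine_similarity(query, term_vector(text)))
        for doc_id, text in documents.items()
    ]
    # Sort by descending score, breaking ties by document id for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


# Hypothetical full texts harvested from different repositories.
documents = {
    "p1": "open access repositories and text mining",
    "p2": "mobile learning with tablets",
    "p3": "mining open access research papers",
}
related = most_similar("text mining of open access papers", documents, top_n=2)
```

A production system would typically add term weighting such as TF-IDF and an index to avoid comparing every pair of documents; this sketch only shows the core similarity step.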

KMi steals the show at EKAW 2010

KMi Reporter

KMi members were very much in evidence at the 17th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2010), which took place in Lisbon on 11-15 October. First held in 1987, EKAW represents the main European forum for research in knowledge technologies. In particular, two prestigious awards were brought home by KMi members. The first one, the prize for best student paper, was won by Fouad Zablith for his paper on "Using Ontological Contexts to Assess the Relevance of Statements in Ontology Evolution", which was written in collaboration with Mathieu d'Aquin, Marta Sabou, and Enrico Motta. Another KMi member, Miriam Fernandez, won the award for Best Poster for her work on "Predicting the quality of semantic relations by applying Machine Learning classifiers", in collaboration with Marta Sabou, Petr Knoth, and Enrico Motta. The influential role of KMi in this research community was also confirmed by the three keynotes given by KMi members at the main conference and associated workshops. Enrico Motta gave a keynote at the main conference on "Realizing Smart Products" and was also an invited speaker at the workshop on Context, Information and Ontologies, while Mathieu d'Aquin gave a keynote at the Personal Semantic Data workshop. Finally, papers, posters and demos were also presented by Ning Li, Vanessa Lopez and Stefan Dietze. In sum, EKAW 2010 turned out to be yet another exciting and high-profile event which confirmed KMi's international status at the forefront of research and development in knowledge technologies. Related Links: EKAW 2010

GET IN TOUCH

FIND US

Knowledge Media Institute

The Open University

Walton Hall

Milton Keynes

MK7 6AA

United Kingdom


JOIN US

  • Interns (information about Erasmus and self-funded internships for bachelor's, master's, and doctoral degree students)
  • PhD students (funding, topics, application process)
  • Postdoctoral researchers
  • Engineers / developers