CASE STUDY

A Software Application to Enable News Contents Collection and Processing for Bias and Reliability Assessment

Designing, building and deploying a Python/Django, Redis and Beautiful Soup application supporting news contents analysis for a research-focused university context

Client Profile & Challenge

The University of Ottawa, often referred to as uOttawa, is a bilingual public research university based in Ottawa, Canada. According to Wikipedia, the institution is the largest English-French bilingual university in the world and offers a wide variety of academic programs, administered by ten faculties. The university is a member of the Canadian U15 group of research-intensive universities. It enrolls over 35,000 undergraduate and over 6,000 post-graduate students and has a network of more than 195,000 alumni.

The university aims to take research and innovation to a new level by enabling researchers, educators and students to try to better understand and improve the world around us. uOttawa works closely with many researchers from around the world, along with institutional partners and funding agencies both in Canada and overseas.

Ottawa Dialogue is a university-based organization that brings together research and action in the field of dialogue and mediation. Guided by the needs of the parties in conflict, Ottawa Dialogue develops and carries out quiet, long-term, dialogue-driven initiatives around the world. As a complement to its field work, Ottawa Dialogue pursues a rich research agenda focused on conflict analysis, third-party dialogue-based interventions, and best practices relating to “Track Two Diplomacy”.

To bring innovation to the field of monitoring and evaluation and to strengthen its organizational capacity for conflict analysis, Ottawa Dialogue sought to develop a tool that streamlines data collection, filtering and analysis. The related research projects involve analyzing a vast number of articles from across the web and determining their bias and reliability, along with their position in Google News. This approach gives researchers a better understanding of the state of media on the web in relation to the ever-changing situation on the ground in conflict zones over long periods of time. Given the amount of data needed for research projects of this type, a specialized data collection and processing software solution had to be designed and developed.

Solution

With extensive experience in web technologies and data analysis, SoftKraft was engaged to develop a customized system that automates a number of the tasks involved in processing news web content and the significant amounts of data that come with it.

Students, researchers, and educators often rely on news media for their work. Given the massive and fluctuating supply of news and news-like input, it is often difficult to discern which sources are more dependable than others. The tool was built to help guide research work by empowering users to assess the reliability and bias of the content analysed, identify sources that warrant confidence and stand up to critical review, and thus produce more trustworthy research output.

Working collaboratively with the Client’s Product Owner, we looked closely at how the whole process had been structured, examined possible solutions that could improve it in a non-disruptive way, and decided on a specific process design to be supported by the software. Technically speaking, the solution was equipped with tools for creating and managing search term lists with multiple customizable parameters, a tool for creating and managing news source websites and determining their reliability and bias, and a user interface for filtering and managing the information gathered.

All in all, the system lets the Client focus on the bigger picture and significantly improves the overall efficiency of news data collection, processing and analysis.

Features

  • Google News and news source website data collection and processing functionality
  • tools for managing search terms lists with customizable parameters
  • a tool for managing news sources and determining their reliability / bias
  • user interface for filtering and managing the information gathered
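
As a rough illustration of how the source-rating and filtering features could fit together, here is a minimal Python sketch. It is not the code of the delivered system: the names NewsSource, Article and filter_articles, the 0.0 to 1.0 reliability scale, the bias labels and the rank field are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


# Hypothetical data model; field names and scales are illustrative assumptions.
@dataclass(frozen=True)
class NewsSource:
    domain: str
    reliability: float  # assumed scale: 0.0 (unreliable) to 1.0 (reliable)
    bias: str           # assumed labels, e.g. "left", "center", "right"


@dataclass(frozen=True)
class Article:
    title: str
    source: NewsSource
    rank: int  # assumed: position in the search results, 1 = top


def filter_articles(
    articles: Iterable[Article],
    search_terms: Iterable[str],
    min_reliability: float = 0.0,
    bias: Optional[str] = None,
) -> List[Article]:
    """Keep articles whose title contains any search term (case-insensitive)
    and whose source meets the reliability and bias criteria, ordered by rank."""
    terms = [t.lower() for t in search_terms]
    matches = [
        a for a in articles
        if any(t in a.title.lower() for t in terms)
        and a.source.reliability >= min_reliability
        and (bias is None or a.source.bias == bias)
    ]
    return sorted(matches, key=lambda a: a.rank)


if __name__ == "__main__":
    # Made-up sample data, for demonstration only.
    trusted = NewsSource("a.example", reliability=0.9, bias="center")
    doubtful = NewsSource("b.example", reliability=0.3, bias="left")
    collected = [
        Article("Ceasefire talks resume", doubtful, rank=1),
        Article("Ceasefire holds overnight", trusted, rank=2),
        Article("Markets rally", trusted, rank=3),
    ]
    for article in filter_articles(collected, ["ceasefire"], min_reliability=0.5):
        print(article.rank, article.title, article.source.domain)
```

In a Django application, records like these would more likely be stored as database models and filtered with queries; plain dataclasses are used here only to keep the sketch self-contained.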

Django
Celery
Redis
Selenium
Beautiful Soup

Benefits

  • improved efficiency of news contents collection and processing
  • automation of the news contents processing and analysis (reduced manual work)
  • user interface designed for ease of use and a better user experience
