Not Only Text — Image Reuse Detection

Plagiarism beyond text. A growing concern in research

Scholars, teachers, and editors have long been used to thinking of plagiarism mostly as the improper reuse of text. Those cases really are the most widespread, but they are only the tip of the iceberg. Improper reuse can involve not only text but also data, tables, images, and even ideas.
While plagiarized ideas can barely be detected by computer so far, improper reuse of other elements of research can be. A modern, comprehensive solution for plagiarism detection must address at least some of these cases. The most sought-after software tools are the ones capable of detecting image reuse.

The importance of images in modern scientific communication

In modern science, images are the main confirming element of an article. Readers who don’t study the supplementary materials (if they even exist) rely on photos, graphical models, and data visualizations as evidence that the research has indeed been performed. Western blot images convince readers that the experiment was carried out and supports the conclusion stated in the text; micrographs show that the researchers really saw the described cells under the microscope; and plots convey the volume of data and the method of analysis, which can be intuitively compared with the text.

In “Sidereus Nuncius” by Galileo and other works of that era, images were mainly illustrative. Even the “animalculi” drawn by Antonie van Leeuwenhoek only gave an idea of what the newly discovered creatures might look like. Before the invention of photography and modern methods of data analysis, images were just an extension of the text. Now they are necessary to show readers the accuracy and reliability of the conclusions. But what if this tool is misused?
This is an astonishingly frequent situation. Elisabeth M. Bik (Stanford, California, USA) and her colleagues have been analyzing image manipulation in scientific papers for a long time, and their results could sow distrust of science in anyone who reads them. They show that image manipulation can be detected in at least 3.8% of the papers examined, and that it often takes the form of duplication.

The need for advanced image detection in research

While checking papers, the Advacheck system has already found cases where the same image was presented in one scientific article as a magnesium composite, in another as a zinc composite, and in a third as a sodium alginate composite. In another example, the same image appears in several scientific papers as a CT scan of a boy and as a CT scan of an adult female.

Universities and research institutes need programs that can ensure image integrity and help them avoid publishing papers with image misconduct, which, in turn, can lead to retraction and undesirable publicity. Moreover, the improper reuse of images leads to a legal problem. In contrast to text citation, which can be considered “fair use” given the appropriate reference, reusing an image requires explicit written permission from the copyright holder. Not only the author of the research paper but also the university and the publisher can be held liable for publishing pictures without said permission. Programs that can detect “image plagiarism” or misuse of images are desperately needed.

Advacheck’s role in detecting image reuse and misuse

Nowadays, almost all major search engines, such as Google and Bing, can search for images that look similar to a given one. But they are practically useless at detecting plagiarized images. They are tuned to find visually similar pictures and are far from ideal at finding exact matches, and exact matching is precisely the challenge in image reuse detection. Moreover, web search engines can only search a database of previously indexed image files: search robots (also known as “spiders”) constantly crawl web pages looking for image files in common formats such as *.JPG, *.PNG, or *.WEBP and load them into their search caches if technically possible.

A typical web search service looks for conventionally formatted images already present in its databases. Meanwhile, scientific articles are often published as PDFs, which “search spiders” cannot fully index. Thus, when a Google image search finds nothing, that result says little about whether an image has been improperly reused.

Advacheck, our software solution, has its own search spiders, but they index databases of full-text scientific publications. The key difference from conventional search engines is that Advacheck downloads full-text PDF files and extracts images from them. Then, if it finds a picture in the examined text that is similar to one in the database, it provides a link to the source article of the original image. Users can follow the link and verify directly from the program that the images are identical (or very similar).
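To make the index-and-lookup idea more concrete, here is a minimal sketch in Python. It is not Advacheck's actual implementation: the `ImageIndex` class, the `average_hash` fingerprint, the distance threshold, and the source labels are all illustrative assumptions. It also assumes that images have already been extracted from the PDFs, converted to grayscale, and resized to the same small grid.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # grayscale pixels (0-255), row-major, same size for all images


def average_hash(img: Image) -> int:
    """Fingerprint with one bit per pixel: 1 if the pixel is brighter than the mean."""
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


class ImageIndex:
    """Hypothetical in-memory index: fingerprint -> label of the source article."""

    def __init__(self) -> None:
        self._entries: List[Tuple[int, str]] = []

    def add(self, img: Image, source: str) -> None:
        self._entries.append((average_hash(img), source))

    def lookup(self, img: Image, max_distance: int = 2) -> Optional[str]:
        """Return the source of the closest indexed image, or None if nothing is close."""
        query = average_hash(img)
        best = min(self._entries, key=lambda e: hamming(e[0], query), default=None)
        if best is not None and hamming(best[0], query) <= max_distance:
            return best[1]
        return None
```

A checker built this way would call `add` for every image extracted from an indexed publication and `lookup` for every image in the paper under examination. A small `max_distance` keeps the search close to exact matching rather than loose visual similarity.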

The second feature of Advacheck image search is that it targets exact matches and doesn’t return pictures that merely show a similar cell or have a similar color palette, as Google does. To make it capable of finding image manipulations within these strict boundaries, we have programmed it to search for specific types of transformations that dishonest researchers and paper mills can introduce:

  • rotation;
  • flipping (horizontal or vertical);
  • cropping;
  • color balance change.
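
The transformations listed above can be illustrated with a toy check in Python. This is only a sketch under simplifying assumptions, not the algorithm Advacheck uses: images are small grayscale grids, rotation is limited to multiples of 90 degrees, a color balance change is modeled as a uniform brightness or contrast shift, and cropping is handled by a brute-force sliding window. All function names are hypothetical.

```python
from typing import Iterator, List

Image = List[List[int]]  # grayscale pixels, row-major


def rotate90(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def variants(img: Image) -> Iterator[Image]:
    """Yield all 8 combinations of 90-degree rotations and flips."""
    current = img
    for _ in range(4):
        yield current
        yield flip_horizontal(current)
        current = rotate90(current)


def normalize(img: Image) -> Image:
    """Stretch brightness to 0..255 so uniform brightness/contrast shifts cancel out."""
    pixels = [p for row in img for p in row]
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return [[0 for _ in row] for row in img]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in img]


def contains(big: Image, small: Image, tolerance: int = 8) -> bool:
    """True if `small` matches some crop of `big` after brightness normalization."""
    bh, bw, sh, sw = len(big), len(big[0]), len(small), len(small[0])
    target = normalize(small)
    for top in range(bh - sh + 1):
        for left in range(bw - sw + 1):
            window = normalize([row[left:left + sw] for row in big[top:top + sh]])
            if all(abs(a - b) <= tolerance
                   for wr, tr in zip(window, target) for a, b in zip(wr, tr)):
                return True
    return False


def is_reused(original: Image, suspect: Image) -> bool:
    """Check every rotation/flip variant of `suspect` against every crop of `original`."""
    return any(contains(original, v) for v in variants(suspect))
```

For example, a suspect image produced by cropping a region of `original`, brightening it, and rotating it by 90 degrees would still be reported by `is_reused`. The brute-force window search is far too slow for real image sizes; it is used here only to keep the sketch short.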

This set of features makes Advacheck powerful at finding image reuse. It goes without saying that the program only establishes the fact of reuse, and the human operator must then decide whether that reuse was legitimate. However, the technical ability to list all instances of reuse (if any) can be in high demand among people and institutions whose businesses and activities rely on publishing, even beyond science. The current implementation of Advacheck’s image search engine is already in use in biology and medicine but can also be useful in architecture, engineering, and science consulting, an emerging area of science-related business.

As recently as the beginning of the 20th century, science publishing was predominantly text-based, but now images play an increasingly crucial role in science communication. Modern science is going visual, so the integrity and uniqueness of images are emerging topics that have already given rise to plenty of technical solutions. Advacheck keeps pace with new demands and addresses new challenges.

