Friday, April 09, 2021

Automatic Text Summarization: use of Artificial Intelligence (AI)

                        



A huge amount of information is generated every day from many sources such as news, social media, RSS feeds, and others. It is difficult for anyone to read and digest all of this content at the pace it is produced, so it needs to be summarized effectively to be useful. Automatic Text Summarization has therefore become essential: it reduces the large text of a document to a short summary while still conveying the gist of the whole document.



What is Automatic Text Summarization?


Automatic Text Summarization is an automated process of generating concise and accurate summaries of a given text document without human help while preserving the meaning of the original text document. 

The common Artificial Intelligence-based approaches used in text summarization are statistical methods, graph theory, machine learning, and deep learning.
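To make the graph-theory approach a little more concrete, here is a minimal sketch in Python. It treats each sentence as a node, links sentences by word overlap, and ranks them with a PageRank-style iteration, in the spirit of TextRank. The function names, the Jaccard similarity measure, and the damping value of 0.85 are assumptions chosen for illustration; this is not the method used by any particular tool mentioned in this post.

```python
import re


def split_sentences(text):
    """Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def word_set(sentence):
    """Set of lowercase words in a sentence."""
    return set(re.findall(r"[a-z']+", sentence.lower()))


def similarity(a, b):
    """Jaccard overlap between the word sets of two sentences."""
    wa, wb = word_set(a), word_set(b)
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with a PageRank-style iteration over the similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    # Edge weight between two different sentences is their word overlap.
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on a share of its score, proportional to the edge weight.
            incoming = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


def graph_summarize(text, num_sentences=2):
    """Return the top-ranked sentences, kept in their original order."""
    sentences = split_sentences(text)
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return " ".join(sentences[i] for i in sorted(top[:num_sentences]))
```

Calling `graph_summarize(article_text, num_sentences=3)` on a plain-text article returns its three most "central" sentences. A sentence that shares no words with the rest of the text receives the lowest score and tends to be dropped.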


Benefits of Automatic text summarization



● Get the maximum information in the minimum time.

● Convert unstructured textual data into a readable form.

● Help readers keep pace with the vast amount of content generated daily.

● Eliminate redundant text and bring out only the essential information.

● Surface important facts that a human reader might overlook.




Example of text summarization

The original text is given below:

Text and data mining: Copyright issues 

Text and data mining (TDM)

As information professionals, we always focus on providing accurate and timely information to the users in the best possible ways and finding the different techniques to analyze and extract information from unstructured and structured data from various sources. Researchers are always in thirst for data and the latest information on which they can build upon their future research and support their findings. Researchers are able to work upon more and more research content through TDM because through this process large amounts of information can be analyzed electronically. Text and Data Mining has now become an important tool in scientific research and many other domains. From Social sciences, arts, and literature, and to the other Scientific fields, the role of TDM has become essential to extract the structured and unstructured data and analyze it to reach a certain knowledge pattern. Knowledge discovery through Text and Data Mining (TDM) can definitely lead to some revolutionary findings in many fields.

Text and Data Mining (TDM) is a computational process of generating information by extracting and analyzing structured and unstructured data.

Definition:

Article 2(2) of the DSM (Digital Single Market) Directive defines text and data mining (‘TDM’) as:

“any automated analytical technique aimed at analyzing text and data in digital form in order to generate information which includes but is not limited to patterns, trends, and correlations”1.

“Text and data mining (TDM) is the process of deriving information from machine-read material. It works by copying large quantities of material, extracting the data, and recombining it to identify patterns.” (UK Government)

Difference between Text Mining and Data Mining: 

Text Mining is the computational process of extracting and analyzing unstructured data to reach a certain pattern of information.

Data Mining is the computational process of extracting and analyzing structured data to reach a certain pattern of information. 

What about copyrights and legal issues involved in TDM activity?

Text and data mining activities include copying the work, extraction of data, and analysis of data to generate useful information. Copying of any work without the permission of the author or whoever is the owner of work is a violation of copyrights. In this case, the exception to copyright exists which allows the copying of work for non-commercial research. TDM activity is allowed only for that work that is subscribed by the researchers and they have lawful access to that work. 

According to DSM directives, there are two exceptions to the restrictions on copying for TDM.

 According to the DSM directives 3 and 4, 

According to Article 3 TDM is permitted on copyrighted works where the user has lawful access to the protected work.

Lawful access means the rights to read the works which are described as “access to content based on an open access policy or through contractual arrangements between rights holders and research organizations or cultural heritage institutions, such as subscriptions, or through other lawful means”2

Research organizations and cultural heritage institutions involving universities have the primary goal of conducting scientific research and carrying out educational activities are permitted for copying and extraction of data from the copyrighted works if they have “lawful access” to the protected work.  In addition to permitting mining activities Art. 3(2) allows the secure storage and retention of copies of mined works and other subject-matter “for the purposes of scientific research, including for the verification of research results”.

Article 4 permits reproductions of, and extractions from, “lawfully accessible works” for TDM for any purpose. Art. 4 applies only on condition that right holders have not expressly reserved their rights “in an appropriate manner, such as machine-readable means in the case of content made publicly available online”

According to Prof. Matthew Sag in his article that “copying expressive works for non-expressive purposes should not be counted as infringement and must be recognized as fair use.”4 TDM is a good example of nonexpressive use of copyrighted works, as the purpose of TDM is not to read those articles but to reach to certain patterns of information, trends and correlations through the automated analysis of the data in those articles.

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3331606

Those contracts and terms of authors and publishers which restrict the researchers’ activity of text and data mining on their protected works without any reason are unenforceable.


Summarized Text is given below:

Text and data mining As information professionals, we always focus on providing accurate and timely information to the users in the best possible ways and finding the different techniques to analyze and extract information from unstructured and structured data from various sources.

Text and Data Mining is a computational process of generating information by extracting and analyzing structured and unstructured data.

It works by copying large quantities of material, extracting the data, and recombining it to identify patterns.” Difference between Text Mining and Data Mining:

Text and data mining activities include copying the work, extraction of data, and analysis of data to generate useful information.

Research organizations and cultural heritage institutions involving universities have the primary goal of conducting scientific research and carrying out educational activities are permitted for copying and extraction of data from the copyrighted works if they have “lawful access” to the protected work.

According to Prof. Matthew Sag in his article that “copying expressive works for non-expressive purposes should not be counted as infringement and must be recognized as fair use.”4 TDM is a good example of nonexpressive use of copyrighted works, as the purpose of TDM is not to read those articles but to reach to certain patterns of information, trends and correlations through the automated analysis of the data in those articles.



Extractive and Abstractive Methods of Automatic Text Summarization



Extractive methods extract key phrases and sentences from the original document and stitch portions of the content together to produce a condensed version. In other words, they create summaries by copying and rearranging passages from the original text.
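As a rough illustration of the extractive idea, the sketch below scores each sentence by how frequent its words are in the whole text and copies the highest-scoring sentences into the summary. This is a simple statistical approach written for this post; the stop-word list, the naive sentence splitter, and the function names are all assumptions, and real tools are considerably more sophisticated.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption; real tools use much larger lists).
STOP_WORDS = {
    "a", "an", "and", "the", "of", "to", "in", "is", "it", "on", "for",
    "by", "from", "that", "this", "are", "as", "with", "be", "or", "at",
}


def split_sentences(text):
    """Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase words with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOP_WORDS]


def summarize(text, num_sentences=2):
    """Copy the sentences whose words are most frequent across the whole text."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average word frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])  # restore original order
    return " ".join(sentences[i] for i in keep)
```

Because the output is built only from sentences copied out of the input, every sentence in the summary can be found word for word in the original text, which is the defining property of an extractive summary.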


Abstractive methods use Natural Language Processing techniques to create summaries that may contain words not present in the original document. In other words, they create summaries by generating new phrases, rephrasing, and using words that do not appear in the original text. This requires extensive natural language processing.




Examples of Some Text Summarization tools:

  • TextSummarization: This Text Summarization API is based on Advanced NLP and Machine Learning technologies and can be used to summarize the URL or document.
  • Text Compactor: It is a free Online Automatic Text Summarization Tool.
  • Summarizebot: It uses Artificial Intelligence, Machine learning, Natural Language Processing, and Blockchain Technologies for text summarization.
  • SMMRY: It is an online tool for summarizing URLs, PDF, and TXT documents.
  • Simplify: It is a Chrome extension that allows you to turn any lengthy article into a summary.
  • Resoomer: It is an online paraphrasing and summarizing tool that works with several languages with many custom settings.
  • Scholarcy: It is an AI-powered article summarizer good for researchers. It summarizes the whole paper with important phrases and contributions. It provides a custom setting that allows the user to choose the number of words, the level of highlighting, and the level of language variation.

Recent advancements:

Many more AI-based techniques for text summarization are being developed across the world. With the recent advancements in deep learning, the field is steadily moving toward more advanced text summarization techniques.

On November 16, 2020, the Allen Institute for Artificial Intelligence (AI2) launched an abstractive summarization model in its flagship product, the Semantic Scholar search engine. The AI2 model uses a transformer, a type of neural network architecture that underlies many of the recent advances in NLP.


A team from Google Brain and Imperial College London has developed a summarization technique that, according to VentureBeat, the team claims “achieves state-of-the-art results in 12 summarization tasks spanning news, science, stories, instructions, emails, patents, and legislative bills, and that it shows ‘surprising’ performance on low-resource summarization, surpassing previous top results on six data sets with only 1,000 examples.” (https://venturebeat.com/2019/12/23/google-brains-ai-achieves-state-of-the-art-text-summarization-performance/)


