Article Summarizer Using AI

An AI-based web application that produces concise summaries of articles using advanced natural language processing (NLP) techniques.

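Introduction

Article-Summarizer-Using-AI is a web application designed to summarize lengthy articles using NLP. Users can upload their own articles or pick from sample data, and a generative AI model produces summaries in a variety of styles.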

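Data Exploration

Dataset

The dataset used for training and evaluation is the PubMed summarization dataset. It consists of articles from PubMed, each paired with its abstract, which serves as the reference summary.

Loading the dataset:

from datasets import load_dataset

# Load the first 1,000 training examples
pubmed_data = load_dataset("ccdv/pubmed-summarization", split='train[:1000]')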

Initial data cleaning:

Remove rows with missing values to ensure data quality.

pubmed_data = pubmed_data.filter(lambda x: x['article'] is not None and x['abstract'] is not None)

Exploratory data analysis:
- Examine the distributions of article length and abstract length.
- Identify common topics and terms in the dataset.
```
print(pubmed_data[0])  # View the first data entry
```
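
The two exploration steps above could be sketched as follows. This is a minimal, standard-library-only illustration: `summarize_lengths` and `top_terms` are hypothetical helper names, whitespace splitting stands in for a real tokenizer, and each record only needs `article` and `abstract` keys, as in the dataset rows.

```python
from collections import Counter
from statistics import mean, median


def summarize_lengths(records):
    """Return word-count statistics for the 'article' and 'abstract' fields."""
    stats = {}
    for field in ("article", "abstract"):
        lengths = [len(row[field].split()) for row in records]
        stats[field] = {
            "min": min(lengths),
            "max": max(lengths),
            "mean": mean(lengths),
            "median": median(lengths),
        }
    return stats


def top_terms(records, field="article", n=10, min_len=4):
    """Count the most frequent lowercase terms of at least min_len characters."""
    counts = Counter()
    for row in records:
        for token in row[field].lower().split():
            token = token.strip(".,;:()[]\"'")
            if len(token) >= min_len:
                counts[token] += 1
    return counts.most_common(n)


# Tiny hand-made sample; with the real data, pass pubmed_data (or a slice of it).
sample = [
    {"article": "Insulin therapy lowers glucose. Insulin dosing varies.",
     "abstract": "Insulin lowers glucose."},
    {"article": "Glucose monitoring guides insulin therapy.",
     "abstract": "Monitoring guides therapy."},
]
print(summarize_lengths(sample))
print(top_terms(sample, n=3))
```

Filtering by a minimum token length is only a crude way to skip function words; the stop-word removal described under Preprocessing would be a more principled filter.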

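Model Selection

Preprocessing

Text tokenization:

Split the text into sentences and words for detailed analysis.

import nltk
from nltk.tokenize import sent_tokenize, word_tokenize

nltk.download('punkt')  # tokenizer models; newer NLTK releases may also need 'punkt_tab'
sentences = sent_tokenize(article_text)  # article_text: the article to summarize, as a string
words = [word for sentence in sentences for word in word_tokenize(sentence)]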

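Stop word removal:

Remove common English words that do not contribute to the summary.

from nltk.corpus import stopwords

# Requires the 'stopwords' corpus: nltk.download('stopwords')
stop_words = set(stopwords.words('english'))
words = [word for word in words if word.lower() not in stop_words]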

Lemmatization:

Convert words to their base forms.

from nltk.stem import WordNetLemmatizer

# Requires the 'wordnet' corpus: nltk.download('wordnet')
lemmatizer = WordNetLemmatizer()
words = [lemmatizer.lemmatize(word.lower()) for word in words]

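Generative Model

API configuration:

Use the google.generativeai library to generate summaries.

import os
import google.generativeai as genai

# Read the key from the environment variable named your_api_key
api_key = os.environ.get('your_api_key')
genai.configure(api_key=api_key)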
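
Because `os.environ.get` returns `None` when the variable is unset, a missing key would otherwise only surface later as an authentication error. A small guard can fail early instead. This is a sketch: `load_api_key` is a hypothetical helper, and `your_api_key` is the variable name used above.

```python
import os


def load_api_key(name="your_api_key", env=None):
    """Return the API key stored in the given environment mapping.

    Raises RuntimeError with a readable message if the variable is unset or blank.
    """
    env = os.environ if env is None else env
    value = (env.get(name) or "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set")
    return value


# Usage with an explicit mapping, so the example runs without a real key.
print(load_api_key(env={"your_api_key": "demo-key"}))  # prints demo-key
```

In the application, the returned value would be passed to `genai.configure(api_key=...)`.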

Model initialization:
- Set up the generative AI model.
```
model = genai.GenerativeModel()
```
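
The application offers summaries in several styles, but the README does not show how a style is turned into a request. One possible approach, sketched here under stated assumptions, is to map each style to an instruction and prepend it to the article. The names `STYLE_INSTRUCTIONS` and `build_prompt`, the `detailed` and `bullets` styles, and the `max_chars` limit are all hypothetical; only the `brief` style name appears in the project's route code.

```python
STYLE_INSTRUCTIONS = {
    "brief": "Summarize the article in two or three sentences.",
    "detailed": "Write a detailed summary covering methods, results, and conclusions.",
    "bullets": "Summarize the article as a short bulleted list of key points.",
}


def build_prompt(article_text, style="brief", max_chars=20000):
    """Build a summarization prompt for the requested style.

    Unknown styles fall back to 'brief'; very long articles are truncated
    to max_chars characters to keep the request size bounded.
    """
    instruction = STYLE_INSTRUCTIONS.get(style, STYLE_INSTRUCTIONS["brief"])
    body = article_text.strip()[:max_chars]
    return f"{instruction}\n\nArticle:\n{body}"


prompt = build_prompt("Insulin therapy lowers blood glucose in most patients.", "bullets")
print(prompt)
# With a configured model, the prompt could then be sent with:
#   response = model.generate_content(prompt)
#   summary_text = response.text
```

Keeping prompt construction in a pure function makes it easy to check without calling the API.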

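Model Fine-Tuning

Training

Fine-tune the model on the PubMed dataset to improve summary quality.

# Illustrative pseudo-code only: GenerativeModel has no train() method,
# so this call stands in for whatever tuning workflow the model provider offers.
model.train(dataset=pubmed_data, epochs=10, learning_rate=0.001)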

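Extractive Summarization

Approach

For extractive summarization, the application uses traditional NLP techniques to identify the key sentences in an article, without relying on a generative model.

Extractive summarization script:
Rename the provided extractive_summary.py to app.py and move it to the project root:
```
mv /mnt/data/extractive_summary.py app.py
```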

Core logic:

The extractive summarization script uses statistical and heuristic methods to identify the most important sentences in the text.

# Example of extractive summarization
from nltk.tokenize import sent_tokenize

def extractive_summary(text):
    # Split the text into sentences
    sentences = sent_tokenize(text)
    # Placeholder for ranking: simply keep the first 3 sentences
    summary = ' '.join(sentences[:3])
    return summary
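
The example above simply keeps the first three sentences. One common statistical heuristic, shown here as a sketch rather than the project's actual script, scores each sentence by how frequent its words are across the article and keeps the top-scoring sentences in their original order. The regex sentence splitter, the small stop-word set, and the name `frequency_summary` are assumptions made to keep the example self-contained; `sent_tokenize` and the NLTK stop-word list could be substituted.

```python
import re
from collections import Counter

# Deliberately tiny stop-word set, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "are", "it", "on", "for", "with"}


def frequency_summary(text, max_sentences=3):
    """Pick the sentences whose words are most frequent in the text."""
    # Naive split on '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    tokenized = [
        [w for w in re.findall(r"[a-z']+", s.lower()) if w not in STOP_WORDS]
        for s in sentences
    ]
    freq = Counter(w for words in tokenized for w in words)

    def score(words):
        # Average word frequency, so long sentences are not favored by length alone.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(tokenized[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])  # restore document order
    return " ".join(sentences[i] for i in chosen)


text = (
    "Insulin regulates blood glucose. "
    "The weather was pleasant. "
    "Glucose levels rise when insulin is low. "
    "Insulin therapy restores glucose control."
)
print(frequency_summary(text, max_sentences=2))
```

With this sample, the off-topic sentence about the weather scores lowest because none of its words recur elsewhere in the text.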

Integration:

Integrate the extractive summarization logic with the Flask application so that users can choose between generative and extractive summaries.

from flask import render_template, request

@app.route('/summarize', methods=['POST'])
def summarize():
    # Prefer an uploaded file; otherwise fall back to a sample from the dataset
    if 'file' in request.files and request.files['file'].filename != '':
        file = request.files['file']
        article_text = file.read().decode("utf-8")
    else:
        sample_index = int(request.form['sample'])
        article_text = pubmed_data[sample_index]['article']

    style = request.form.get('style', 'brief')
    summary_method = request.form.get('method', 'generative')

    if summary_method == 'generative':
        summary_text = preprocess_and_summarize(article_text, style)
    else:
        summary_text = extractive_summary(article_text)

    return render_template('result.html', original=article_text, summary=summary_text)
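
In the route above, `int(request.form['sample'])` raises if the field is not a number, and an out-of-range index fails on the dataset lookup. A small validation helper could turn these cases into one predictable error. This is only a sketch: `parse_sample_index` is a hypothetical name, and the function is deliberately independent of Flask.

```python
def parse_sample_index(raw_value, dataset_size):
    """Convert a submitted form value to a valid dataset index.

    Raises ValueError for missing, non-numeric, or out-of-range input.
    """
    if raw_value is None:
        raise ValueError("No sample was selected")
    try:
        index = int(raw_value)
    except (TypeError, ValueError):
        raise ValueError(f"Sample index must be an integer, got {raw_value!r}") from None
    if not 0 <= index < dataset_size:
        raise ValueError(f"Sample index {index} is outside 0..{dataset_size - 1}")
    return index


print(parse_sample_index("3", dataset_size=1000))  # prints 3
```

Inside the route, it could be called as `parse_sample_index(request.form.get('sample'), len(pubmed_data))`, with a `ValueError` translated into a 400 response.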

Evaluation

Evaluate the model's performance using metrics such as ROUGE or BLEU.

from nltk.translate.bleu_score import sentence_bleu

# reference_summary and generated_summary are plain strings
reference = [reference_summary.split()]
candidate = generated_summary.split()
score = sentence_bleu(reference, candidate)
print(f'BLEU Score: {score}')
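
The text mentions ROUGE alongside BLEU, but only BLEU is shown. As a rough illustration of what ROUGE-1 measures, the sketch below computes unigram overlap by hand. It is a simplification resting on several assumptions (whitespace tokenization, lowercasing only, no stemming); `rouge_1` is a hypothetical helper, not a library function, and a maintained ROUGE package would be preferable for reported results.

```python
from collections import Counter


def rouge_1(reference, candidate):
    """Return ROUGE-1 precision, recall, and F1 from clipped unigram overlap."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    overlap = sum((ref_counts & cand_counts).values())  # clipped matches
    precision = overlap / sum(cand_counts.values()) if cand_counts else 0.0
    recall = overlap / sum(ref_counts.values()) if ref_counts else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge_1(
    reference="insulin therapy lowers blood glucose",
    candidate="insulin lowers glucose levels",
)
print(scores)
```

Here three of the four candidate words appear in the five-word reference, so precision is 3/4 and recall is 3/5.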

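Web Application Development

Backend

Flask setup:

Initialize the Flask application and configure the login manager.

from flask import Flask, render_template
from flask_login import LoginManager

app = Flask(__name__)
app.secret_key = 'your_secret_key'  # placeholder; replace with your own secret value
login_manager = LoginManager(app)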

Routes and authentication:

Implement routes for login, registration, summarization, and logout.

@app.route('/login', methods=['GET', 'POST'])
def login():
    # login logic here
    return render_template('login.html')

Frontend

Templates:

Create HTML templates for the user interface.

<!-- templates/index.html -->
<form action="{{ url_for('summarize') }}" method="post" enctype="multipart/form-data">
    <input type="file" name="file">
    <button type="submit">Summarize</button>
</form>

User experience:
- Provide a user-friendly interface with clear instructions and feedback.

Installation

Prerequisites

Python 3.7+
Flask
NLTK
A generative AI library (e.g. google.generativeai)
An API key for the generative AI service

Steps

Clone the repository:

git clone https://github.com/yourusername/Article-Summarizer-Using-AI.git

Navigate to the project directory:
```
cd Article-Summarizer-Using-AI
```

Create a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`

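Install dependencies:
```
pip install -r requirements.txt
```
Set the environment variables:
- Create a .env file containing your API key.
```
your_api_key=<YOUR_GENERATIVE_AI_API_KEY>
```
Download NLTK data:
The script downloads the required NLTK data automatically.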

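Usage

Run the application:
```
flask run --port=5001
```
Access the application:
- Open http://127.0.0.1:5001 in your browser.
Log in or register:
- Register a new account or log in with existing credentials.
Summarize an article:
- Upload a text file or select a sample to summarize.
View the summary:
- The summary is displayed on the results page.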

Thank you for using Article-Summarizer-Using-AI! We hope you find it useful for your summarization needs.

Additional information

Version: 1.0.0
Type: other source code
Updated: 2025-03-10
Size: 13.53 KB
Source: GitHub
