
news 2025/8/18 20:57:05


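Contents

1. What is a Pipeline
2. Listing the task types Pipeline supports
3. Creating and using a Pipeline
- 3.1 Creating a Pipeline directly from a task type (the default model is English)
- 3.2 Specifying the task type and a model to create a Pipeline based on that model
- 3.3 Loading the model first, then creating the Pipeline
- 3.4 Running inference on the GPU
- 3.5 Checking the device
- 3.6 Measuring inference time
- 3.7 Determining a Pipeline's parameters
4. How a Pipeline works behind the scenes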

These are study notes for the video series at https://space.bilibili.com/21060026/channel/collectiondetail?sid=1357748

Project repository: https://github.com/zyds/transformers-code

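1. What is a Pipeline

A Pipeline chains three stages, data preprocessing, model invocation, and result post-processing, into a single workflow (the original post illustrates this with a flowchart).
It lets us pass in raw text and get the final answer back directly, without having to deal with the intermediate details.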
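
To make the three stages concrete, here is a minimal toy sketch in plain Python. It does not use transformers at all: the function names (`preprocess`, `forward`, `postprocess`, `toy_pipeline`) and the word lists are hypothetical stand-ins chosen only for illustration, not the library's implementation.

```python
from typing import Dict, List


def preprocess(text: str) -> List[str]:
    """Stand-in for tokenization: lowercase and split on whitespace."""
    return text.lower().split()


def forward(tokens: List[str]) -> Dict[str, float]:
    """Stand-in for the model: score each label by counting cue words."""
    positive = {"good", "great", "nice"}
    negative = {"bad", "poor", "awful"}
    return {
        "POSITIVE": float(sum(t in positive for t in tokens)),
        "NEGATIVE": float(sum(t in negative for t in tokens)),
    }


def postprocess(scores: Dict[str, float]) -> str:
    """Stand-in for post-processing: pick the highest-scoring label."""
    return max(scores, key=scores.get)


def toy_pipeline(text: str) -> str:
    """Chain the three stages so the caller only handles raw text."""
    return postprocess(forward(preprocess(text)))


print(toy_pipeline("very good"))
```

The caller sees only text in and a label out, which is the same convenience a real Pipeline offers on top of a tokenizer and a model.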

2. Listing the task types Pipeline supports

from transformers.pipelines import SUPPORTED_TASKS

# Print each task name together with its implementation details
for k, v in SUPPORTED_TASKS.items():
    print(k, v)

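This prints the task types that Pipeline currently supports, along with the implementation and models each task can use.
Sample output: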

audio-classification {'impl': <class 'transformers.pipelines.audio_classification.AudioClassificationPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForAudioClassification'>,), 'default': {'model': {'pt': ('superb/wav2vec2-base-superb-ks', '372e048')}}, 'type': 'audio'}

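k: the name of the task, for example audio classification
v: how the task is implemented, that is, which Pipeline class it uses, whether a TensorFlow model is available, whether a PyTorch model is available, and which model is used by default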
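
As a rough sketch of how such an entry can be read programmatically, the snippet below summarizes a mapping with the same shape as the sample output above. `SAMPLE_TASKS` is hand-written from that output, with class objects replaced by plain strings so that it runs without transformers; `default_pt_model` and `summarize` are hypothetical helper names. Entries in the real mapping may be nested differently, so the helper returns None when the expected keys are missing.

```python
from typing import Any, Dict, Optional

# Hand-written sample mirroring the printed entry; classes are shown as strings.
SAMPLE_TASKS: Dict[str, Dict[str, Any]] = {
    "audio-classification": {
        "impl": "AudioClassificationPipeline",
        "tf": (),
        "pt": ("AutoModelForAudioClassification",),
        "default": {"model": {"pt": ("superb/wav2vec2-base-superb-ks", "372e048")}},
        "type": "audio",
    }
}


def default_pt_model(entry: Dict[str, Any]) -> Optional[str]:
    """Return the default PyTorch checkpoint name, or None if it is not listed."""
    pt = entry.get("default", {}).get("model", {}).get("pt")
    return pt[0] if pt else None


def summarize(tasks: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Reduce each task entry to the fields discussed in the text."""
    return {
        name: {
            "type": entry.get("type"),
            "has_tf": bool(entry.get("tf")),
            "has_pt": bool(entry.get("pt")),
            "default_pt_model": default_pt_model(entry),
        }
        for name, entry in tasks.items()
    }


for name, info in summarize(SAMPLE_TASKS).items():
    print(name, info)
```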

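3. Creating and using a Pipeline

3.1 Creating a Pipeline directly from a task type (the default model is English)

from transformers import pipeline
pipe = pipeline("text-classification")  # create the pipeline from the task name alone
pipe("very good")  # run one sentence through it and return the result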
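
A text-classification pipeline returns a list with one dict per input, each holding a `label` and a `score`. The sketch below assumes that shape and shows one way to filter it; `labels_above` is a hypothetical helper, and the sample values are made up for illustration rather than real model output.

```python
from typing import Dict, List, Union

Result = Dict[str, Union[str, float]]


def labels_above(results: List[Result], threshold: float = 0.5) -> List[str]:
    """Return the labels whose score is at least `threshold`."""
    return [str(r["label"]) for r in results if float(r["score"]) >= threshold]


# Made-up results in the shape a call on two sentences would return.
sample: List[Result] = [
    {"label": "POSITIVE", "score": 0.99},
    {"label": "NEGATIVE", "score": 0.42},
]
print(labels_above(sample))
```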

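3.2 Specifying the task type and a model to create a Pipeline based on that model

Note: the model has already been downloaded to a local directory here.

# Models can be found at https://huggingface.co/models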
pipe = pipeline("text-classification", model="./models/roberta-base-finetuned-dianping-chinese")

3.3 Loading the model first, then creating the Pipeline

from transformers import AutoModelForSequenceClassification, AutoTokenizer

# With this approach, both model and tokenizer must be specified
model = AutoModelForSequenceClassification.from_pretrained("./models_roberta-base-finetuned-dianping-chinese")
tokenizer = AutoTokenizer.from_pretrained("./models_roberta-base-finetuned-dianping-chinese")
pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)

3.4 Running inference on the GPU

pipe = pipeline("text-classification", model="./models_roberta-base-finetuned-dianping-chinese", device=0)
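
Passing `device=0` assumes a CUDA GPU is present. A small sketch of a fallback is shown below: it returns the GPU index when CUDA is available and -1 otherwise, which `pipeline()` treats as CPU. `pick_device` and `detect_cuda` are hypothetical helper names, not part of transformers.

```python
def pick_device(cuda_available: bool, gpu_index: int = 0) -> int:
    """Return a device index for pipeline(): the GPU index, or -1 for CPU."""
    return gpu_index if cuda_available else -1


def detect_cuda() -> bool:
    """Report whether torch sees a CUDA device; False if torch is not installed."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


device = pick_device(detect_cuda())
print(device)

# The result can then be passed along, for example:
# pipe = pipeline("text-classification", model="./models_roberta-base-finetuned-dianping-chinese", device=device)
```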

3.5 Checking the device

pipe.model.device

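3.6 Measuring inference time

import torch
import time

times = []
for i in range(100):
    torch.cuda.synchronize()  # wait for pending GPU work before starting the clock
    start = time.time()
    pipe("我觉得不太行！")  # "I don't think this is very good!"
    torch.cuda.synchronize()  # wait for the GPU to finish before stopping the clock
    end = time.time()
    times.append(end - start)
print(sum(times) / 100)  # average seconds per call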
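
The same measurement can be wrapped in a reusable helper. The sketch below is one possible variant, not the author's code: it adds a few warm-up calls, uses `time.perf_counter`, and takes the synchronization step as an optional callback so it also runs on CPU. `average_latency` and its parameters are hypothetical names.

```python
import time
from typing import Callable, Optional


def average_latency(
    fn: Callable[[], object],
    runs: int = 100,
    warmup: int = 5,
    sync: Optional[Callable[[], None]] = None,
) -> float:
    """Return the mean wall-clock seconds per call of `fn` over `runs` calls."""
    for _ in range(warmup):  # warm-up calls are not timed
        fn()
    total = 0.0
    for _ in range(runs):
        if sync is not None:
            sync()  # make sure earlier async work is done before timing
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for this call's async work to finish
        total += time.perf_counter() - start
    return total / runs


# Stand-in workload so the example runs without a model.
print(average_latency(lambda: sum(range(1000)), runs=10, warmup=2))

# With a GPU pipeline this might look like:
# average_latency(lambda: pipe("我觉得不太行！"), sync=torch.cuda.synchronize)
```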

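3.7 Determining a Pipeline's parameters

# Create a pipeline first
qa_pipe = pipeline("question-answering", model="../../models/models")
qa_pipe

Output (a screenshot in the original post): a QuestionAnsweringPipeline object.

The class definition explains how this pipeline is meant to be used: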

class QuestionAnsweringPipeline(ChunkPipeline):
    """
    Question Answering pipeline using any `ModelForQuestionAnswering`. See the [question answering
    examples](../task_summary#question-answering) for more information.

    Example:

    ```python
    >>> from transformers import pipeline

    >>> oracle = pipeline(model="deepset/roberta-base-squad2")
    >>> oracle(question="Where do I live?", context="My name is Wolfgang and I live in Berlin")
    {'score': 0.9191, 'start': 34, 'end': 40, 'answer': 'Berlin'}
    ```

    Learn more about the basics of using a pipeline in the [pipeline tutorial](../pipeline_tutorial)

    This question answering pipeline can currently be loaded from [`pipeline`] using the following task identifier:
    `"question-answering"`.

    The models that this pipeline can use are models that have been fine-tuned on a question answering task. See the
    up-to-date list of available models on
    [huggingface.co/models](https://huggingface.co/models?filter=question-answering).
    """

Inside the pipeline class, the __call__ method documents the additional parameters it supports:

    def __call__(self, *args, **kwargs):
        """
        Answer the question(s) given as inputs by using the context(s).

        Args:
            args ([`SquadExample`] or a list of [`SquadExample`]):
                One or several [`SquadExample`] containing the question and context.
            X ([`SquadExample`] or a list of [`SquadExample`], *optional*):
                One or several [`SquadExample`] containing the question and context (will be treated the same way as if
                passed as the first positional argument).
            data ([`SquadExample`] or a list of [`SquadExample`], *optional*):
                One or several [`SquadExample`] containing the question and context (will be treated the same way as if
                passed as the first positional argument).
            question (`str` or `List[str]`):
                One or several question(s) (must be used in conjunction with the `context` argument).
            context (`str` or `List[str]`):
                One or several context(s) associated with the question(s) (must be used in conjunction with the
                `question` argument).
            topk (`int`, *optional*, defaults to 1):
                The number of answers to return (will be chosen by order of likelihood). Note that we return less than
                topk answers if there are not enough options available within the context.
            doc_stride (`int`, *optional*, defaults to 128):
                If the context is too long to fit with the question for the model, it will be split in several chunks
                with some overlap. This argument controls the size of that overlap.
            max_answer_len (`int`, *optional*, defaults to 15):
                The maximum length of predicted answers (e.g., only answers with a shorter length are considered).
            max_seq_len (`int`, *optional*, defaults to 384):
                The maximum length of the total sentence (context + question) in tokens of each chunk passed to the
                model. The context will be split in several chunks (using `doc_stride` as overlap) if needed.
            max_question_len (`int`, *optional*, defaults to 64):
                The maximum length of the question after tokenization. It will be truncated if needed.
            handle_impossible_answer (`bool`, *optional*, defaults to `False`):
                Whether or not we accept impossible as an answer.
            align_to_words (`bool`, *optional*, defaults to `True`):
                Attempts to align the answer to real words. Improves quality on space separated langages. Might hurt on
                non-space-separated languages (like Japanese or Chinese)

        Return:
            A `dict` or a list of `dict`: Each result comes as a dictionary with the following keys:

            - **score** (`float`) -- The probability associated to the answer.
            - **start** (`int`) -- The character start index of the answer (in the tokenized version of the input).
            - **end** (`int`) -- The character end index of the answer (in the tokenized version of the input).
            - **answer** (`str`) -- The answer to the question.
        """
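
The `doc_stride` description above says a long context is split into chunks that overlap. The following is a simplified sketch of that idea on a plain token list. It is not the library's implementation, which delegates to the tokenizer and also accounts for the question and special tokens; `split_with_overlap` is a hypothetical name.

```python
from typing import List


def split_with_overlap(tokens: List[str], max_len: int, overlap: int) -> List[List[str]]:
    """Split `tokens` into windows of at most `max_len`, sharing `overlap` tokens."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")
    step = max_len - overlap  # how far each new window advances
    chunks: List[List[str]] = []
    start = 0
    while True:
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):  # this window reached the end
            break
        start += step
    return chunks


for chunk in split_with_overlap(list("abcdefghij"), max_len=4, overlap=2):
    print(chunk)
```

Because consecutive windows share tokens, an answer that straddles a chunk boundary still appears whole in at least one chunk, provided it is no longer than the overlap.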

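Here is an example.

We pass in the question 中国的首都是哪里？ ("Where is the capital of China?") with the context 中国的首都是北京 ("The capital of China is Beijing"):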

qa_pipe(question="中国的首都是哪里？", context="中国的首都是北京")

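If the max_answer_len parameter is used to cap the length of the answer, the output is cut down to fit: as the docstring above notes, only answers no longer than this value are considered.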

qa_pipe(question="中国的首都是哪里？", context="中国的首都是北京", max_answer_len=1)
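
One way to picture why the answer shrinks is to think of the model as scoring every position as a possible answer start and end, and of decoding as picking the best span that fits within the length limit. The sketch below is a simplified illustration under that assumption, not the library's actual decoding code; `best_span` is a hypothetical name and the scores are made up.

```python
from typing import List, Tuple


def best_span(
    start_scores: List[float], end_scores: List[float], max_answer_len: int
) -> Tuple[int, int, float]:
    """Return (start, end, score) of the best span of at most `max_answer_len` tokens."""
    best = None
    for s, s_score in enumerate(start_scores):
        # Only consider ends that keep the span within the length limit.
        for e in range(s, min(s + max_answer_len, len(end_scores))):
            score = s_score * end_scores[e]
            if best is None or score > best[2]:
                best = (s, e, score)
    if best is None:
        raise ValueError("no candidate span")
    return best


tokens = list("中国的首都是北京")
# Made-up probabilities: the start peaks at 北, the end peaks at 京.
start_scores = [0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.90, 0.04]
end_scores = [0.01, 0.01, 0.01, 0.01, 0.01, 0.02, 0.10, 0.83]

for limit in (15, 1):
    s, e, _ = best_span(start_scores, end_scores, limit)
    print(limit, "".join(tokens[s:e + 1]))
```

With a generous limit the two-token span wins; with a limit of 1 only single-token spans are candidates, so the result is shorter.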

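4. How a Pipeline works behind the scenes

Step 1: initialize the components (tokenizer and model)

# Step 1: initialize the tokenizer and model
from transformers import AutoModelForSequenceClassification, AutoTokenizer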
tokenizer = AutoTokenizer.from_pretrained("../../models/models_roberta-base-finetuned-dianping-chinese")
model = AutoModelForSequenceClassification.from_pretrained("../../models/models_roberta-base-finetuned-dianping-chinese")

Step 2: preprocessing

# Preprocess the text; the tokenizer returns a dict of PyTorch tensors
input_text = "我觉得不太行！"
inputs = tokenizer(input_text, return_tensors="pt")
inputs

Step 3: model prediction

res = model(**inputs)
res

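The model output contains several fields, such as loss and logits (loss is only populated when labels are passed in).

Step 4: post-processing the result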

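import torch

logits = res.logits
probs = torch.softmax(logits, dim=-1)  # convert logits to probabilities
pred = torch.argmax(probs).item()  # index of the most likely class
result = model.config.id2label.get(pred)  # map the index to its label name
result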
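
The post-processing step can also be followed by hand without torch. The sketch below recomputes softmax, argmax, and the label lookup in plain Python. The logits and the `id2label` mapping are made-up values for illustration, not output from the model above, and `softmax` and `decode` are hypothetical helper names.

```python
import math
from typing import Dict, List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Convert logits to probabilities, subtracting the max for numerical stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits: List[float], id2label: Dict[int, str]) -> Tuple[str, float]:
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    pred = max(range(len(probs)), key=probs.__getitem__)  # argmax
    return id2label[pred], probs[pred]


# Made-up logits and label map for a two-class sentiment model.
logits = [2.0, -1.0]
id2label = {0: "negative", 1: "positive"}

label, score = decode(logits, id2label)
print(label, round(score, 4))
```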
