EasyAnimate
Updated: 12/21/2024
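Overview
EasyAnimate is a transformer-based pipeline for generating AI images and videos and for training Diffusion Transformer baseline models and LoRA models. It supports direct inference from pretrained EasyAnimate models, producing videos at various resolutions that run about 6 seconds at 8 fps (EasyAnimateV5, 1 to 49 frames). Users can also train their own baseline models and LoRA models to apply style transformations.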
Project introduction
Project information
Online demo
HuggingFace demo: click to visit
ModelScope demo: click to visit
Model download
- If downloads are too slow, consider replacing https://huggingface.co/ with https://hf-mirror.com
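The host swap described in the tip above can be sketched in Python as follows. This is a minimal illustration only: the helper name `to_mirror_url` is hypothetical, and the sketch assumes the mirror serves the same paths as the original host.

```python
HF_HOST = "https://huggingface.co"
MIRROR_HOST = "https://hf-mirror.com"


def to_mirror_url(url: str) -> str:
    """Swap the Hugging Face host for the mirror host; leave other URLs untouched."""
    if url == HF_HOST or url.startswith(HF_HOST + "/"):
        # Keep everything after the host (path, query) exactly as it was.
        return MIRROR_HOST + url[len(HF_HOST):]
    return url


if __name__ == "__main__":
    print(to_mirror_url("https://huggingface.co/alibaba-pai/EasyAnimateV5-12b-zh"))
```

URLs on other hosts, such as ModelScope links, pass through unchanged.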
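Name | Type | Storage | Hugging Face | Model Scope | Description |
---|---|---|---|---|---|
EasyAnimateV5-7b-zh-InP | EasyAnimateV5 | 22 GB | Link | Link | Official 7B image-to-video weights. Supports video prediction at multiple resolutions (512, 768, 1024); trained with 49 frames at 8 fps; supports prompts in both Chinese and English |
EasyAnimateV5-7b-zh | EasyAnimateV5 | 22 GB | Link | Link | Official 7B text-to-video weights. Can be fine-tuned for downstream tasks. Supports video prediction at multiple resolutions (512, 768, 1024); trained with 49 frames at 8 fps; supports prompts in both Chinese and English |
EasyAnimateV5-Reward-LoRAs | EasyAnimateV5 | - | Link | Link | Uses reward backpropagation to optimize videos generated by EasyAnimateV5-12b so that they better match human preferences |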
Weight placement
📦 models/
├── 📂 Diffusion_Transformer/
│ ├── 📂 EasyAnimateV5-12b-zh-InP/
│ └── 📂 EasyAnimateV5-12b-zh/
├── 📂 Personalized_Model/
│ └── your trained transformer model / your trained lora model (for UI load)
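The directory layout above could be created ahead of time with a short script. This is a sketch under assumptions: the function name `create_model_dirs` is hypothetical, and the folder names are copied from the tree as shown.

```python
from pathlib import Path

# Folder names copied from the tree above.
SUBDIRS = [
    "Diffusion_Transformer/EasyAnimateV5-12b-zh-InP",
    "Diffusion_Transformer/EasyAnimateV5-12b-zh",
    "Personalized_Model",
]


def create_model_dirs(root) -> list:
    """Create models/<subdir> under root and return the created paths."""
    created = []
    for sub in SUBDIRS:
        target = Path(root) / "models" / sub
        # parents=True builds intermediate folders; exist_ok=True makes reruns safe.
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created


if __name__ == "__main__":
    for path in create_model_dirs("."):
        print(path)
```

Running it from the repository root leaves existing folders untouched and only adds the missing ones.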
Paper
arXiv: click to visit
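New features
- Updated to v5: supports video generation up to 1024x1024, 49 frames, 6 s, 8 fps; scales the model to 12B; adopts the MMDIT architecture; supports control models with different inputs; supports prompts in both Chinese and English.
- Updated to v4: supports video generation up to 1024x1024, 144 frames, 6 s, 24 fps; supports text-to-video, image-to-video, and video-to-video generation; a single model handles any resolution from 512 to 1280; supports prompts in both Chinese and English.
- Updated to v3: supports video generation up to 960x960, 144 frames, 6 s, 24 fps; supports text-to-video and image-to-video models.
- The ModelScope-Sora "Data Director" Creative Sprint, the third Data-Juicer large-model data challenge, has officially launched! It uses EasyAnimate as the base model to explore how data processing affects model training. Visit the competition website for details.
- Updated to v2: supports video generation up to 768x768, 144 frames, 6 s, 24 fps.
- Code created! Windows and Linux are now supported.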
System and environment requirements
EasyAnimate has been verified to run in the following environments:
Windows details
- OS: Windows 10
- python: python3.10 & python3.11
- pytorch: torch2.2.0
- CUDA: 11.8 & 12.1
- CUDNN: 8+
- GPU: Nvidia-3060 12G
Linux details
- OS: Ubuntu 20.04, CentOS
- python: python3.10 & python3.11
- pytorch: torch2.2.0
- CUDA: 11.8 & 12.1
- CUDNN: 8+
- GPU: Nvidia-V100 16G & Nvidia-A10 24G & Nvidia-A100 40G & Nvidia-A100 80G
About 60 GB of free disk space is required. Please check that it is available!
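One way to check the disk-space requirement is with Python's standard `shutil.disk_usage`. This is a minimal sketch: the helper names are hypothetical, and the 60 GB threshold is interpreted here as 60 GiB, which is an assumption.

```python
import shutil

REQUIRED_GB = 60  # figure quoted above; treated as GiB in this sketch


def free_disk_gb(path: str = ".") -> float:
    """Free space, in GiB, on the filesystem that contains path."""
    return shutil.disk_usage(path).free / 1024**3


def has_enough_space(path: str = ".", required_gb: float = REQUIRED_GB) -> bool:
    """True when the free space is at least required_gb."""
    return free_disk_gb(path) >= required_gb


if __name__ == "__main__":
    print(f"free: {free_disk_gb():.1f} GB, enough: {has_enough_space()}")
```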
The video sizes EasyAnimateV5-12B can generate depend on the available GPU memory:
GPU memory | 384x672x72 | 384x672x49 | 576x1008x25 | 576x1008x49 | 768x1344x25 | 768x1344x49 |
---|---|---|---|---|---|---|
16GB | 🧡 | 🧡 | ❌ | ❌ | ❌ | ❌ |
24GB | 🧡 | 🧡 | 🧡 | 🧡 | ❌ | ❌ |
40GB | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
80GB | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
✅ means it can run with "model_cpu_offload",
🧡 means it can run with "model_cpu_offload_and_qfloat8",
⭕️ means it can run with "sequential_cpu_offload",
❌ means it cannot run. Note that running with sequential_cpu_offload is slower.
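The table and legend above can be encoded as a small lookup, which may help when scripting runs across machines. This is only a sketch: the mode strings are quoted from the legend, the function name `memory_mode` is hypothetical, and how the mode is actually passed to EasyAnimate is not shown here.

```python
RESOLUTIONS = [
    "384x672x72", "384x672x49", "576x1008x25",
    "576x1008x49", "768x1344x25", "768x1344x49",
]

# Short keys for the legend symbols: "ok" = ✅, "q8" = 🧡, "no" = ❌.
MODE = {
    "ok": "model_cpu_offload",
    "q8": "model_cpu_offload_and_qfloat8",
    "no": None,
}

# One row per GPU memory size (GB), in the same column order as RESOLUTIONS.
TABLE = {
    16: ["q8", "q8", "no", "no", "no", "no"],
    24: ["q8", "q8", "q8", "q8", "no", "no"],
    40: ["ok", "ok", "ok", "ok", "no", "no"],
    80: ["ok", "ok", "ok", "ok", "ok", "ok"],
}


def memory_mode(gpu_gb: int, resolution: str):
    """Return the offload mode listed in the table, or None if it cannot run."""
    return MODE[TABLE[gpu_gb][RESOLUTIONS.index(resolution)]]


if __name__ == "__main__":
    print(memory_mode(24, "576x1008x49"))
```

The lookup raises `KeyError` or `ValueError` for memory sizes or resolutions that are not in the table, rather than guessing.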
Note
- Some GPUs, such as the 2080ti and V100, do not support torch.bfloat16. For these cards, change weight_dtype in app.py and the predict files to torch.float16 before running.
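As a rough illustration of the note above, the dtype could be chosen programmatically instead of edited by hand. This is a sketch, not how app.py or the predict files do it: the helper names are hypothetical, and `torch.cuda.is_bf16_supported()` can report differently across PyTorch versions, so a manual override to torch.float16 may still be needed on older cards.

```python
def weight_dtype_name(bf16_supported: bool) -> str:
    """Name of the torch dtype to use, following the note above."""
    return "bfloat16" if bf16_supported else "float16"


def detect_weight_dtype():
    """Return torch.bfloat16 when the GPU reports bf16 support, else torch.float16."""
    import torch  # imported lazily so the helper above works without PyTorch

    supported = torch.cuda.is_available() and torch.cuda.is_bf16_supported()
    return getattr(torch, weight_dtype_name(supported))
```

The value returned by `detect_weight_dtype()` would then be assigned to `weight_dtype` wherever the scripts define it.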
Generation times for EasyAnimateV5-12B over 25 steps on different GPUs:
GPU | 384x672x72 | 384x672x49 | 576x1008x25 | 576x1008x49 | 768x1344x25 | 768x1344x49 |
---|---|---|---|---|---|---|
A10 24GB | ~120 s (4.8 s/it) | ~240 s (9.6 s/it) | ~320 s (12.7 s/it) | ~750 s (29.8 s/it) | ❌ | ❌ |
A100 80GB | ~45 s (1.75 s/it) | ~90 s (3.7 s/it) | ~120 s (4.7 s/it) | ~300 s (11.4 s/it) | ~265 s (10.6 s/it) | ~710 s (28.3 s/it) |
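The totals in the table are roughly the step count multiplied by the per-iteration time, then rounded. The sketch below reproduces that arithmetic for two cells; the function name is hypothetical, and model loading or video decoding time is assumed not to be included.

```python
def estimate_seconds(steps: int, seconds_per_it: float) -> float:
    """Approximate sampling time: number of steps times seconds per iteration."""
    return steps * seconds_per_it


if __name__ == "__main__":
    # Per-iteration figures taken from the table above, at 25 steps.
    cells = [
        ("A10 24GB, 384x672x72", 4.8),
        ("A100 80GB, 768x1344x49", 28.3),
    ]
    for label, s_per_it in cells:
        print(f"{label}: about {estimate_seconds(25, s_per_it):.1f} s")
```

The same formula gives a quick estimate for other step counts, for example 40 or 50 steps on the same GPU and resolution.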
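Install via Docker
- When using Docker, make sure the GPU driver and the CUDA environment are correctly installed on the machine, then run the following commands in order: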
# pull image
docker pull mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easycv/torch_cuda:easyanimate
# enter image
docker run -it -p 7860:7860 --network host --gpus all --security-opt seccomp:unconfined --shm-size 200g mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easycv/torch_cuda:easyanimate
# clone code
git clone https://github.com/aigc-apps/EasyAnimate.git
# enter EasyAnimate's dir
cd EasyAnimate
# download weights
mkdir models/Diffusion_Transformer
mkdir models/Motion_Module
mkdir models/Personalized_Model
# Please use the huggingface link or modelscope link to download the EasyAnimateV5 model.
# I2V models
# https://huggingface.co/alibaba-pai/EasyAnimateV5-12b-zh-InP
# https://modelscope.cn/models/PAI/EasyAnimateV5-12b-zh-InP
# T2V models
# https://huggingface.co/alibaba-pai/EasyAnimateV5-12b-zh
# https://modelscope.cn/models/PAI/EasyAnimateV5-12b-zh
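The weights listed above could also be fetched from Python with `snapshot_download` from the `huggingface_hub` package. This is a sketch under assumptions: `huggingface_hub` must be installed separately, the helper names are hypothetical, and the target folder follows the `models/Diffusion_Transformer` layout used in this document.

```python
from pathlib import Path

# Repository IDs taken from the Hugging Face links above.
REPOS = {
    "i2v": "alibaba-pai/EasyAnimateV5-12b-zh-InP",
    "t2v": "alibaba-pai/EasyAnimateV5-12b-zh",
}


def local_dir_for(repo_id: str, root: str = "models/Diffusion_Transformer") -> Path:
    """Map a repo ID to its folder under models/Diffusion_Transformer."""
    return Path(root) / repo_id.split("/")[-1]


def download(kind: str) -> Path:
    """Download the "i2v" or "t2v" weights into the expected folder."""
    from huggingface_hub import snapshot_download  # requires: pip install huggingface_hub

    repo_id = REPOS[kind]
    target = local_dir_for(repo_id)
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target
```

Calling `download("i2v")` from the EasyAnimate directory would place the image-to-video weights where the weight-placement tree expects them.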
V5 no-install win-webui build 🔑
Note
After extracting the archive, make sure the path contains no Chinese characters.
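A quick way to check an extraction path before launching is to look for non-ASCII characters. This is a sketch with hypothetical helper names and example paths; requiring ASCII-only paths is stricter than the note above, which only rules out Chinese characters.

```python
def non_ascii_chars(path: str) -> list:
    """Return the characters in path that fall outside the ASCII range."""
    return [ch for ch in path if not ch.isascii()]


def is_safe_install_path(path: str) -> bool:
    """True when the path is ASCII-only, so it contains no Chinese characters."""
    return not non_ascii_chars(path)


if __name__ == "__main__":
    # Hypothetical example paths, for illustration only.
    for candidate in [r"D:\tools\EasyAnimate", "D:\\软件\\EasyAnimate"]:
        print(candidate, is_safe_install_path(candidate))
```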
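Details
- The software has been tested on Windows10 with an Nvidia-4090 GPU
- AMD GPUs and integrated graphics are not supported. At least 8GB of VRAM and CUDA 12 are recommended; correct operation is not guaranteed with less VRAM or an older CUDA version
- Click here to check your GPU information
- The archive already includes the required environment, models, and other large files; no installation is needed, just open and run
- The webUI build is built from the official Git source, is more full-featured, and can run the official examples directly
- Size: 17GB
Preview
Download
Primary link
Mirror link
- Password: 4331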
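V5 model test results
Test environment
- OS: Win10
- GPU and CUDA information: click to view
Text-to-video
Run information
- Time per step (it): 3-5 s
- Time for 40 steps: about 3 minutes
Example 1: cinematic portrait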
A woman with blood on her face and a white tank top looks down and to her right, then back up as she speaks. She has dark hair pulled back, light skin, and her face and chest are covered in blood. The camera angle is a close-up, focused on the woman's face and upper torso. The lighting is dim and blue-toned, creating a somber and intense atmosphere. The scene appears to be from a movie or TV show.
Example 2: landscape
The camera pans over a snow-covered mountain range, revealing a vast expanse of snow-capped peaks and valleys. The mountains are covered in a thick layer of snow, with some areas appearing almost white while others have a slightly darker, almost grayish hue. The peaks are jagged and irregular, with some rising sharply into the sky while others are more rounded. The valleys are deep and narrow, with steep slopes that are also covered in snow. The trees in the foreground are mostly bare, with only a few leaves remaining on their branches. The sky is overcast, with thick clouds obscuring the sun. The overall impression is one of peace and tranquility, with the snow-covered mountains standing as a testament to the power and beauty of nature.
Image-to-video
Example 1
- Run information
- Size: 576x576
- Time per step (it): 5-7 s
- Time for 30 steps: about 3 minutes
A young woman with beautiful clear eyes and black hair, looking at the camera quietly. With a smile on her face, she opened her mouth slightly and the camera focused on her face. The video quality is very high and the field of view is very clear. High quality, masterpiece, best quality, high resolution, ultra-fine, dreamlike.
Example 2
A young woman with beautiful and clear eyes and blonde hair standing and white dress in a forest wearing a crown. She seems to be lost in thought, and the camera focuses on her face. The video is of high quality, and the view is very clear. High quality, masterpiece, best quality, highres, ultra-detailed, fantastic.
ComfyUI version
- Not tested yet; to be added later.