[AI Text-to-Image] Prompt Collection - JustPureH2O's Blog

This post collects AI text-to-image prompts that I found while browsing the web and have tried out myself.

Warning

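Unless otherwise noted, every image in this post was generated with GPT Image2 or Google NanoBanana. Both require a VPN or proxy to access.

Some prompts in this post are not my own work. My original prompts may be AI-generated, and the prompt explanations are AI-generated.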

Character Art to Real Person + Hotpot Restaurant Scene

Important

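For this prompt to work properly, upload the character art or fan art you want to turn into a real-person look to the AI platform as a reference image. Reference images with an unobstructed upper body work best.

Prompt: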

A casual iPhone snapshot of a female cosplayer recreating EXACTLY the character from the reference image.

EXTREMELY STRICT identity match: same hairstyle, same twin tails, same hair color gradients, same hair accessories, same elf ears, identical outfit structure, identical character vibe. Face structure MUST strictly follow the original character proportions and features - highly recognizable as the exact character, NOT a generic pretty face.

Face: very attractive but natural anime-realism beauty, visible pores with slight smoothing, light natural makeup, soft blush on cheeks and nose, skin slightly oily and shiny due to hotpot heat, slight redness from warmth, tiny imperfections (light sweat, uneven texture), NOT overly perfect.

Expression: she NOTICES the camera, slight reaction turning toward camera, gives a small casual gesture (like a quick peace sign or slight smile), expression still natural, not staged, eyes briefly looking toward camera but not intensely posing, feels like a quick friendly response, not a photoshoot.

Hair: slightly messy from heat and movement, a few strands sticking to face or neck, natural motion blur, slightly flattened from sitting.

Outfit: perfectly faithful to original design, real fabric with realistic wrinkles, slightly disordered from sitting and eating, small details slightly shifted.

Pose: slightly turns body toward camera, one hand still holding chopsticks or resting on table, other hand casually makes a small gesture (peace sign / slight wave), body posture relaxed and natural, NOT a full pose, just a quick reaction.

Scene (IMPORTANT): Hotpot restaurant, the cosplayer is sitting at a DIFFERENT TABLE (next table or diagonal), near wall or booth seating, background relatively clean (wall panel / mirror), she is still eating with her own group.

Environment: cleaner dining area near wall, soft wall lighting, light steam from hotpot, table has meat plates, drinks, sauces.

Framing (VERY IMPORTANT): feels like taken from YOUR OWN TABLE, subject is NOT centered, slightly zoomed-in, awkward crop, part of body slightly cut off.

Foreground (EXTREMELY IMPORTANT): your own table dominates foreground, hotpot, soup, chopsticks, plates clearly visible, edge of table blocking lower frame, your arm or shoulder partially blocking view, another diner slightly blocking frame, foreground slightly out of focus.

Camera: bad composition, slightly tilted, minor motion blur, focus slightly off, visible grain, JPEG compression artifacts, lens smudge / greasy blur, finger slightly covering corner.

Lighting: mixed indoor lighting (warm yellow + soft white), slightly uneven exposure, wall lighting softer than central hall, reflections on skin and table surfaces, steam diffusing light. Extra realism: light steam passing in front of subject, subtle background motion blur, minor occlusion (cup / arm / chopsticks blocking area).

Mood: you are eating normally, you notice a very accurate cosplayer at another table, you zoom in to take a photo, she notices and casually reacts with a small friendly gesture, moment feels spontaneous, slight interaction, but still not staged.

Style: raw iPhone snapshot, 9:16 vertical, NOT professional, NOT staged, natural candid feeling with slight interaction, slight playful vibe, like acknowledging being photographed but still casual and natural.

Reference image

Generated image (one of several)

DeepSeek's Prompt Analysis

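This is a hyper-realistic "fake candid" prompt written for an AI image generation model. Its core goal is not a polished anime wallpaper but an extremely lifelike, lived-in image that looks like a phone snapshot an ordinary person casually took in a hotpot restaurant.

To make this clearer, I'll break the prompt into four core layers and explain what each one is "designing":

Layer 1: Core content, extremely strict "character fidelity". The prompt asks the AI to generate a female Coser (cosplayer) who reproduces a given anime character one hundred percent. It uses "EXTREMELY STRICT" to stress that the hairstyle, twin tails, hair color gradients, hair accessories, elf ears, and outfit structure must match the original exactly, and that the face must follow the anime character's specific proportions (rather than a generic influencer face). This is the soul of what lets the image pass as "a real person caught on camera".
Layer 2: Scene and action, creating the moment of "being noticed mid-meal". The scene is a hotpot restaurant, where the Coser is eating with her own friends at the next table. She notices your camera and, without deliberately posing, naturally turns toward you and flashes a peace sign or a smile. Her face carries the flush and slight oily shine from the hotpot steam, and her hair is slightly messy from the heat. The design aims for an interactive moment, "she just noticed you photographing her and responded in a friendly way", rather than a photoshoot.
Layer 3: Composition and foreground, the key to the "candid shot" feel (extremely important). This is the most "calculated" part of the prompt. It forces the subject off-center with part of the body cropped out, and makes your own table fill most of the foreground (with your hotpot, meat slices, and dipping sauces clearly visible). Meanwhile your arm or shoulder, or even the arm of the friend next to you, partially blocks the frame, and the hotpot in the foreground is out of focus. Together these create the viewpoint of "you hurriedly pulling out your phone and zooming in across the table to grab the shot".
Layer 4: Image quality and flaws, deliberately "botching the shot" for realism. The prompt deliberately asks for every "bad habit" of phone photography: tilted composition, slight motion blur, missed focus, visible grain, JPEG compression artifacts, glare from greasy fingerprints on the lens, and even a finger covering one corner. The lighting is set to the uneven mix of warm yellow and cool white lamps typical of a hotpot restaurant. All of this simulates a real off-the-cuff iPhone shot, using "imperfection" to set off "realism". The output is fixed at 9:16 vertical to match phone photo proportions.

Its underlying logic in one sentence: the prompt plays with a contrast aesthetic of "high-precision fidelity + low-quality imaging". It forces the AI to render the anime character with extreme accuracy while also forcing it to shoot the photo in an extremely amateur way. The goal is that, on seeing the result, you get the social-media sense of "wow, there really is a Coser this accurate, and I ran into her at a hotpot restaurant!" rather than "this is yet another polished AI fake".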

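Parts of the prompt you can adapt:

Aspect ratio: in the last paragraph of the prompt, "9:16 vertical" specifies a 9:16 portrait frame. To generate a landscape image, change "vertical" to "horizontal" and set the ratio you want.
Avoiding over-interpretation: the second paragraph of the prompt lists character traits common in anime images, including "twin tails" and "elf ears". If the character in your reference image lacks these two traits, delete or modify them so the AI does not over-interpret the reference image and drift away from it.
Gesture/pose: the "Expression" and "Scene" paragraphs can be rewritten with more specific descriptions.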
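
If you keep the prompt as text in a script, the first two adaptations can be automated. The following is a minimal sketch under that assumption: the two string fragments are copied from the prompt, while the helper names `set_canvas` and `drop_features` and their parameters are hypothetical and belong to no AI platform's API. It only edits text; you still paste the result into the platform yourself.

```python
import re

# Fragments copied from the prompt above (the Style fragment is shortened).
IDENTITY = ("EXTREMELY STRICT identity match: same hairstyle, same twin tails, "
            "same hair color gradients, same hair accessories, same elf ears, "
            "identical outfit structure, identical character vibe.")
STYLE = "Style: raw iPhone snapshot, 9:16 vertical"


def set_canvas(style: str, ratio: str, orientation: str) -> str:
    """Replace the first 'W:H vertical|horizontal' fragment with a new ratio and orientation."""
    return re.sub(r"\d+:\d+ (?:vertical|horizontal)",
                  f"{ratio} {orientation}", style, count=1)


def drop_features(identity: str, features: list[str]) -> str:
    """Remove 'same <feature>' items that the reference character does not have."""
    head, items = identity.split(": ", 1)
    unwanted = {f"same {feature}" for feature in features}
    kept = [item for item in items.rstrip(".").split(", ") if item not in unwanted]
    return f"{head}: {', '.join(kept)}."


if __name__ == "__main__":
    # Landscape output for a character without twin tails or elf ears.
    print(set_canvas(STYLE, "16:9", "horizontal"))
    print(drop_features(IDENTITY, ["twin tails", "elf ears"]))
```

Removing a trait is usually safer than leaving it in, since a trait named in the prompt but absent from the reference image is exactly what pulls the result away from the original character.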

Reference image PID 145960821
Reference image PID 144513289
Reference image PID 140340698
Reference image PID 120872796

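Character Art to Real Person + Anime Convention Scene

Important

For this prompt to work properly, upload the character art or fan art you want to turn into a real-person look to the AI platform as a reference image. Full-body reference images with clear details and no obstructions work best.

Warning

Prompt 1 gives unstable results; users on a limited quota should use it with caution.

Prompt 1: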

Generate a highly detailed photo of a girl cosplaying the uploaded reference illustration, at Comiket. Exactly replicate the same pose, body posture, hand gestures, and facial expression. The image you generate should look like a real-world photograph instead of a generated work. Add slight imperfections (small wrinkles, uneven skin texture) to the cosplayer's face. Her face must look like a genetic beauty rather than a silicon-like doll.

If the image is cropped and lacks detail on the lower body, please complete the lower body of the female coser you draw.

The characters in the picture should be realistic, and taken by a normal camera. Keep the same angle, perspective, and composition, without any deviation.

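Reference image

Generated image

DeepSeek's Prompt Analysis

The core goal of this prompt is to precisely "translate" an anime reference illustration into a real on-site photo shot at an anime convention.

Layer 1: "Absolute replication" of composition and pose. It requires the AI to precisely replicate the character's exact pose, body angles, hand gestures, and facial expression from the reference image, and the shooting angle, perspective, and composition must match the reference exactly with no deviation. One especially detailed instruction stands out: "if cropping leaves the lower body missing, complete it yourself". This suggests your reference image may be a half-body close-up, so the prompt forces the AI to "fill in" the full legs and body structure, preventing cut-off limbs or an incomplete figure.
Layer 2: "De-AI-ing" the facial texture. It explicitly requires the face to be a "genetic beauty" rather than a "silicon-like doll". This counters the "plastic texture" and "perfect symmetry" most typical of AI. It forces the AI to add the tiny imperfections only real humans have (such as fine wrinkles, uneven skin texture, and pores), so the face looks like a flesh-and-blood person with defined bone structure rather than a smooth, flawless 3D model.
Layer 3: A "live shoot" setup for scene and equipment. The scene is Comiket (an anime convention) and the equipment is a "normal camera". This means the shot should have a camera's standard perspective. It also places no emphasis on "candid shooting" or "foreground occlusion", which indicates a proper full-body or half-body photo taken at the convention after asking the cosplayer's permission.

Its underlying logic in one sentence:

It is not recreating a "moment"; it is proving that "this anime character really exists in the real world". By completing the body, rejecting the plastic look, and strictly locking the viewpoint, it has the AI generate a realistic photo that could be posted on social media as a "convention photo".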

Prompt 2 (recommended):

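A cinematic, 4K documentary-style photo of a Coser standing in a bustling anime convention hall.

Coser and match accuracy (critical): the Coser wears a cosplay outfit that perfectly matches the character on the display stand, with 1:1 screen accuracy. The Coser's pose is exactly the same as in the reference image, not mirrored. Every detail must match precisely:

Pose: match all limb angles, weight distribution, head tilt, gaze direction, and finger shapes.

Outfit: precisely match the hairstyle, cut, patterns, colors, and accessories.

Props: every prop must appear, held in the same hand and facing the same direction as in the reference image.

Colors: the palette is strictly locked to the original character design.

Face: do not copy the anime art style; generate strictly according to real human facial features.

Venue: Artist Alley booth style, with a moderately blurred crowd in the background.

Lighting: bright, even light from convention softboxes.

Camera angle: eye level, shot from a slightly low angle.

Lens and settings: 45mm lens, f/3.2 aperture, creating a shallow depth of field with a sharp subject and pleasing background blur. No motion blur.

Overall style: photorealistic, with maximum sharpness and detail. Absolutely no brands may appear anywhere in the frame.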

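Reference image

Generated image

Warning

Uploading a full-body anime image with an eye-level, front-facing composition works best.

Reference image PID 131489130
Reference image PID 117225857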
