[1] SONG Y, ERMON S. Generative modeling by estimating gradients of the data distribution [EB/OL]. (2020-10-10)[2024-01-01]. https://arxiv.org/pdf/1907.05600.
[2] HO J, JAIN A, ABBEEL P. Denoising diffusion probabilistic models [J]. Advances in Neural Information Processing Systems, 2020, 33: 6840-6851.
[3] SOHL-DICKSTEIN J, WEISS E, MAHESWARANATHAN N, et al. Deep unsupervised learning using nonequilibrium thermodynamics [C]// International Conference on Machine Learning. 2015: 2256-2265.
[4] RAMESH A, DHARIWAL P, NICHOL A, et al. Hierarchical text-conditional image generation with CLIP latents [EB/OL]. (2022-04-13)[2024-01-01]. https://arxiv.org/abs/2204.06125.
[5] SAHARIA C, CHAN W, SAXENA S, et al. Photorealistic text-to-image diffusion models with deep language understanding [J]. Advances in Neural Information Processing Systems, 2022, 35: 36479-36494.
[6] ROMBACH R, BLATTMANN A, LORENZ D, et al. High-resolution image synthesis with latent diffusion models [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 10684-10695.
[7] CRESWELL A, WHITE T, DUMOULIN V, et al. Generative adversarial networks: An overview [J]. IEEE Signal Processing Magazine, 2018, 35(1): 53-65.
[8] ESSER P, ROMBACH R, OMMER B. Taming transformers for high-resolution image synthesis [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 12873-12883.
[9] KINGMA D P, WELLING M. Auto-encoding variational Bayes [EB/OL]. (2022-12-10)[2024-01-01]. https://arxiv.org/abs/1312.6114.
[10] VAN DEN OORD A, VINYALS O, KAVUKCUOGLU K. Neural discrete representation learning [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017: 6309-6318.
[11] DHARIWAL P, NICHOL A. Diffusion models beat GANs on image synthesis [J]. Advances in Neural Information Processing Systems, 2021, 34: 8780-8794.
[12] SCHUHMANN C, VENCU R, BEAUMONT R, et al. LAION-400M: Open dataset of CLIP-filtered 400 million image-text pairs [EB/OL]. (2021-11-03)[2024-01-01]. https://arxiv.org/abs/2111.02114.
[13] SCHUHMANN C, BEAUMONT R, VENCU R, et al. LAION-5B: An open large-scale dataset for training next generation image-text models [J]. Advances in Neural Information Processing Systems, 2022, 35: 25278-25294.
[14] GEBRU T, MORGENSTERN J, VECCHIONE B, et al. Datasheets for datasets [J]. Communications of the ACM, 2021, 64(12): 86-92.
[15] WANG Z J, MONTOYA E, MUNECHIKA D, et al. DiffusionDB: A large-scale prompt gallery dataset for text-to-image generative models [C]// Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics. 2023: 893-911.
[16] SCHRAMOWSKI P, BRACK M, DEISEROTH B, et al. Safe latent diffusion: Mitigating inappropriate degeneration in diffusion models [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023: 22522-22531.
[17] RADFORD A, KIM J W, HALLACY C, et al. Learning transferable visual models from natural language supervision [C]// International Conference on Machine Learning. 2021: 8748-8763.
[18] NICHOL A, DHARIWAL P, RAMESH A, et al. GLIDE: Towards photorealistic image generation and editing with text-guided diffusion models [C]// International Conference on Machine Learning. 2022: 16784-16804.
[19] RUIZ N, LI Y, JAMPANI V, et al. DreamBooth: Fine tuning text-to-image diffusion models for subject-driven generation [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023: 22500-22510.
[20] KUMARI N, ZHANG B, ZHANG R, et al. Multi-concept customization of text-to-image diffusion [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023: 1931-1941.
[21] ZHANG E, WANG K, XU X, et al. Forget-me-not: Learning to forget in text-to-image diffusion models [EB/OL]. (2023-03-30)[2024-01-01]. https://arxiv.org/abs/2303.17591.
[22] GANDIKOTA R, MATERZYNSKA J, FIOTTO-KAUFMAN J, et al. Erasing concepts from diffusion models [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023: 2426-2436.
[23] RANDO J, PALEKA D, LINDNER D, et al. Red-teaming the stable diffusion safety filter [EB/OL]. (2022-11-10)[2024-01-01]. https://arxiv.org/abs/2210.04610.
[24] SCHRAMOWSKI P, TAUCHMANN C, KERSTING K. Can machines help us answering question 16 in datasheets, and in turn reflecting on inappropriate content? [C]// 2022 ACM Conference on Fairness, Accountability, and Transparency. 2022: 1350-1361.
[25] LESTER B, AL-RFOU R, CONSTANT N. The power of scale for parameter-efficient prompt tuning [C]// Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 2021: 3045-3059.
[26] EICHENBERG C, BLACK S, WEINBACH S, et al. MAGMA: Multimodal augmentation of generative models through adapter-based finetuning [C]// Findings of the Association for Computational Linguistics: EMNLP 2022. 2022: 2416-2428.
[27] GAL R, PATASHNIK O, MARON H, et al. StyleGAN-NADA: CLIP-guided domain adaptation of image generators [J]. ACM Transactions on Graphics, 2022, 41(4): 141.
[28] KIM G, KWON T, YE J C. DiffusionCLIP: Text-guided diffusion models for robust image manipulation [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 2426-2435.
[29] GEHMAN S, GURURANGAN S, SAP M, et al. RealToxicityPrompts: Evaluating neural toxic degeneration in language models [C]// Findings of the Association for Computational Linguistics: EMNLP 2020. 2020: 3356-3369.
[30] CHANG M, DRUGA S, FIANNACA A J, et al. The prompt artists [C]// Proceedings of the 15th Conference on Creativity and Cognition. 2023: 75-87.
[31] WILLIAMS A, NANGIA N, BOWMAN S R. A broad-coverage challenge corpus for sentence understanding through inference [C]// Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2018: 1112-1122.
[32] RAFFEL C, SHAZEER N, ROBERTS A, et al. Exploring the limits of transfer learning with a unified text-to-text transformer [J]. Journal of Machine Learning Research, 2020, 21(1): 5485-5551.
[33] FAWCETT T. An introduction to ROC analysis [J]. Pattern Recognition Letters, 2006, 27(8): 861-874.
[34] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: Common objects in context [C]// Computer Vision–ECCV 2014. 2014: 740-755.
[35] HEUSEL M, RAMSAUER H, UNTERTHINER T, et al. GANs trained by a two time-scale update rule converge to a local Nash equilibrium [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017: 6629-6640.