Can AI Draw Science? A Benchmark for Evaluating Scientific Figure Generation by Text-to-Image and Multimodal Models



arXiv:2606.28406v1

Abstract: Text-to-image and multimodal generative models are increasingly used to produce scientific figures such as mechanism diagrams, experimental-design schematics, conceptual frameworks, and graphical abstracts. Yet existing image-generation benchmarks (e.g., GenEval, T2I-CompBench, DPG-Bench) evaluate natural images and measure compositionality, object counting, or photorealism. None of them measures what makes a generated scientific figure usable: correct and legible text labels, faithful depiction of entities and their relations, coherent diagrammatic structure, and adherence to disciplinary drawing conventions. We introduce SciDraw-Bench, a benchmark of 32 structured scientific-figure generation tasks spanning eight figure types and ten disciplines, where each task pairs a natural-language prompt with a machine-checkable specification of required labels, relations, components, conventions, and negative constraints. We propose a four-dimensional evaluation protocol: Text Fidelity (OCR-based label recall and character error rate), Semantic Correctness (vision-language-model judging against the specification), Structural Quality, and Convention Adherence, together with a meta-evaluation protocol and a preliminary inter-judge reliability analysis (human-rating validation is ongoing). We evaluate a domain-specific system, SciDraw AI, against representative general-purpose text-to-image models, and outline a code-to-figure baseline as a planned extension. In a pilot over all eight figure types, the domain-specific system substantially outperforms the general-purpose baselines on every dimension and figure type, with the largest gaps on semantic correctness and convention adherence; text fidelity remains the hardest dimension for all systems.
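To make the Text Fidelity dimension concrete, here is a minimal Python sketch of how OCR-based label recall and character error rate could be computed against a task's required labels. It is not the benchmark's actual implementation: the `FigureSpec` class, its field names, the lowercase/whitespace normalization, and the `max_cer` matching threshold are all assumptions made for illustration, and the OCR step itself is assumed to have already produced a list of strings.

```python
from dataclasses import dataclass, field


@dataclass
class FigureSpec:
    """Hypothetical task specification; field names are illustrative only."""
    prompt: str
    required_labels: list[str]
    forbidden_labels: list[str] = field(default_factory=list)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def normalize(text: str) -> str:
    """Assumed normalization: lowercase and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def char_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by reference length, after normalization."""
    ref, hyp = normalize(reference), normalize(hypothesis)
    if not ref:
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


def text_fidelity(spec: FigureSpec, ocr_strings: list[str], max_cer: float = 0.2) -> dict:
    """Match each required label to its closest OCR string.

    A label counts as recalled when its best CER is at most max_cer
    (an assumed threshold). Per-label CER is capped at 1.0.
    """
    best = {}
    for label in spec.required_labels:
        cers = [char_error_rate(label, s) for s in ocr_strings]
        best[label] = min(min(cers, default=1.0), 1.0)
    n = len(best)
    recalled = sum(1 for cer in best.values() if cer <= max_cer)
    return {
        "label_recall": recalled / n if n else 1.0,
        "mean_cer": sum(best.values()) / n if n else 0.0,
    }


spec = FigureSpec(
    prompt="Diagram of cellular respiration",
    required_labels=["Mitochondria", "ATP"],
)
result = text_fidelity(spec, ocr_strings=["mitochondria", "ATB", "Nucleus"])
```

In this usage example, "Mitochondria" is matched exactly after normalization, while "ATP" is only approximated by the misrendered "ATB", so one of the two required labels counts as recalled. A real pipeline would also need to check the specification's relations, components, conventions, and negative constraints, which this sketch leaves out.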



