Tumgik
animunerdery · 1 year
Text
I had a conversation with this accessibility blogger and we dug into various ai that can provide fairly accurate and detailed functional descriptions for those who need them.
While it’s true that text to image exploded into the public limelight, the same technology came from image to text ai designed as accessibility tools.
As the blogger points out, one should definitely check with the original artist before running their work through ai. I personally don’t mind if any of you want to use what’s on this blog to either test out or simply be read by ai. The only caveat is to not make any money off my stuff.
Also, the @accessibleaesthetics blog is a valuable resource for anyone interested in learning more. Do check them out if you’re interested in providing more effective accessibility.
Remember when I talked about how I wished there was some image-to-text AI instead of just the text-to-image AI? Turns out there is!
Tumblr media
This is a screenshot of an image-to-text AI called "clip_prefix_caption," specifically using the model "Coco." And while it's not 100% accurate, it still did a reasonably impressive job with this image. Of course, it kind of makes sense since this photograph is a free-to-use image I pulled off the web, which is almost certainly the kind of stuff this AI was trained on. If we get a type of image very different from what this AI was probably trained on, the results are not nearly as accurate.
Tumblr media
But that's okay, Coco isn't designed for Optical Character Recognition (OCR). If you put this same image into an OCR-focused AI program like Image to Text Converter, you'd get:
Night Vale podcast (zioNightValeRadio A mafia guy who has really misunderstood "make it look like an accident" shouting WHOOPSIE every time he fires the gun. 1:06 PM • 2023-02-10 • 72.4K Views 3,487 Likes 776 Retweets 27 Quotes
Still not perfect, but definitely better than "a man is playing a game on the Nintendo Wii."
But what about art? How do these types of things fare with describing art? Since I'm not 100% clear on what (if any) information about the input images is put into a data set for AI to learn from, I did not want to put just any art in here. And if you play with any of these programs, I strongly encourage you not to put anything in there you don't have explicit permission to use for this.
I got specific permission from @animunerdery to use their drawing of Vinsmoke Sanji for some AI tests:
Tumblr media
I decided to try a few different models too.
clip_prefix_caption (using coco model): A man wearing a tie and a shirt.
Blip: Caption: a black and white drawing of a man wearing a tie
CLIP Interrogator (using ViT-L model): a man in a shirt and tie smoking a cigarette, sanji, fanart ”, short silver hair, boring, lanky, zero - hour, coal, alp
I should note that the last one, while much more detailed, took a lot longer to generate than the other two.
This is by no means exhaustive. If you take a look at the post this image came from, you will find some even more detailed image-to-text AI outputs.
And this isn't even counting image-to-text AI in less open-source projects. Microsoft Word, for example, generates alt text for almost every image you put in a Word document, assuming you're using the current version. The Accessibility Checker will prompt you to check these though, because their accuracy is iffy at best, especially with images that are very far from what was probably in the data sets Microsoft trained its AI on. You can also contribute to that training data set if you want, because Microsoft gives you the option to "donate" any manually-created alt text you add to an image in your document to their database to improve accuracy. It's a case-by-case opt in though, don't worry.
Some screen readers have built-in image-to-text AI as well. For example, sometimes after reading the alt text on an image (be it properly written alt text or the default word "photo" on every image on a Tumblr post without user-added alt text), the VoiceOver (iOS) screen reader will read an additional description it makes using its image-to-text AI. I can't always get it to do this consistently, but after playing around a bit with a version of the Vinsmoke Sanji image that did not have anything but the default "photo" alt text, I got it to give me this:
Adult. Clothing. Illustrations. One.
Not the most helpful. But this technology is still pretty young, and I think it has a lot of potential if used correctly.
66 notes · View notes
animunerdery · 1 year
Text
Tumblr media
Sanjiweek2023 | 5/6 | Kindness / Straw Hat Crew’s Chef
1K notes · View notes
animunerdery · 1 year
Text
Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media
SanjiWeek2023 1-4
216 notes · View notes
animunerdery · 1 year
Note
hey do you take commissions?
I don’t, but I know plenty of people who do!
@marashi96art does one piece, resident evil, slam dunk, dbz and others. Her work is fantastic, definitely worth commissioning.
@bukojuiced Jay does one piece among other things. His aesthetic is both clean and soft, really lovely stuff.
Cam is my girl who recently just moved and has been having car issues so could definitely use the help. Her work is also fun as hell.
Ali is a student, but her work is fantastic and you should 100% commission her if you can.
Bao is amazing and you should definitely commission her. She does pretty much everything and does it well.
Obviously the big names are around as well, but hopefully this is a start. All of these people are deserving and fantastic artists!
7 notes · View notes
animunerdery · 1 year
Text
Tumblr media
🐊🌸
277 notes · View notes
animunerdery · 1 year
Text
Tumblr media
🦅🐊🤡
611 notes · View notes
animunerdery · 1 year
Text
Tumblr media Tumblr media Tumblr media
Happy Valentine’s Day kiddos!
587 notes · View notes
animunerdery · 1 year
Note
To respond to @saltiestgempearl and really to anyone who would like more robust AI image to text tools.
Replicate has a series of image to text tools. Your mileage may vary as most of the tools are optimized for photography and photo hybrids. However, in a pinch they will give you information similar to what someone who isn’t familiar with one piece would get.
For drawings I have found that Clip Interrogator and Clip Interrogator2 give the best results.
The output from clip interrogator 1 at a fast speed with the openAI option:
a man in a shirt and tie smoking a cigarette, sanji, smoker, subject action: smoking a cigar, he is smoking a cigarette, with cigar, holding cigar, short goatee, goatee, smoking, man from uncle, long tie, smoke :6, necktie, with a business suit on, (smoke), johan liebert, oda non
The output from clip interrogator 2 at a fast speed with max flavors set at 4:
a man in a shirt and tie smoking a cigarette, kentaro miura manga art style, kentaro miura manga style, inspired by Sadamichi Hirasawa, wearing a shirt with a tie, tall anime guy with blue eyes, handsome anime pose, manga style of kentaro miura, anime handsome man, sanji, kentaro miura art style
They’re not perfect descriptions, but if I had to write a functional description, it would be:
Greyscale anime-esque line drawing of a scrawny, curly-browed blond boy in a skinny black tie and white shirtsleeves rolled up to his elbows. His floppy hair parts to his right (our left), covering his right eye entirely. On his left, his hair tucks behind his ear, leaving the left side of his face as clear as his left eye, which, like his shirtsleeves, betrays a degree of rumpled exhaustion. The swirly part of his eyebrow swirls up and is closer to his nose than his ear. He also sports a scraggly little darker colored goatee, and a dainty cigarette dangles from his ever so slightly parted lips. He’s got his right hand in his way too tight black pants pocket and the other one doing who knows what off screen.
(While it would give more information on the image itself, the description would still be meaningless to people unfamiliar with one piece.)
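Both interrogator outputs above repeat themselves a lot, so if you want to use one as a starting point for a description, it can help to trim it first. Here is a rough Python sketch of that idea. It assumes the output is a comma-separated list whose first phrase is the plain caption, which is how the outputs in this post look, not something I have confirmed about the tools. The filler word list and the function names are made up for this example.

```python
import re

# Small, hand-picked list of filler words to ignore when comparing phrases.
# This list is an assumption for the sketch, not something the tools define.
FILLER = {"a", "an", "the", "of", "in", "with", "and", "is", "he", "on"}


def content_words(phrase):
    """Lowercase a phrase and return its words, minus filler words."""
    return set(re.findall(r"[a-z0-9]+", phrase.lower())) - FILLER


def trim_tags(raw_output):
    """Split interrogator output into a lead caption and non-redundant tags."""
    phrases = [p.strip() for p in raw_output.split(",") if p.strip()]
    if not phrases:
        return "", []
    caption = phrases[0]
    seen = content_words(caption)
    kept = []
    for phrase in phrases[1:]:
        words = content_words(phrase)
        # Skip phrases that add no new content words (pure repeats).
        if not words or words <= seen:
            continue
        kept.append(phrase)
        seen |= words
    return caption, kept


# The clip interrogator 2 output quoted earlier in this post.
RAW = (
    "a man in a shirt and tie smoking a cigarette, kentaro miura manga art style, "
    "kentaro miura manga style, inspired by Sadamichi Hirasawa, "
    "wearing a shirt with a tie, tall anime guy with blue eyes, "
    "handsome anime pose, manga style of kentaro miura, anime handsome man, "
    "sanji, kentaro miura art style"
)

caption, tags = trim_tags(RAW)
print(caption)
for tag in tags:
    print(" -", tag)
```

On the output above this keeps the caption plus six of the ten trailing phrases. It is a crude word-overlap filter, so near-repeats like "wearing a shirt with a tie" still slip through, and it cannot tell a useful detail from a stray artist name. A person still has to read the result.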
At any rate…
I’m sure people have a spectrum of feelings towards AI, so it’s probably best to ask for permission if you want to use these tools to “read” other artists’ work. I personally don’t mind what you do with things I make (just don’t make money off it!)
The vast majority of images on the internet have no tags at all, be it alt or title text. However, with these AI tools, hopefully at least the functional description side of things will open up.
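If you want to see how many images on a page of your own are missing alt and title text, here is a minimal Python sketch using only the standard library's html.parser. The sample markup, file names, and function names are all made up for this example. It also treats an empty alt as "no description", even though an empty alt is the usual way to mark a purely decorative image, so take its results as a list of things to look at rather than a list of mistakes.

```python
from html.parser import HTMLParser


class ImageTextAudit(HTMLParser):
    """Collect <img> tags and record whether each has alt or title text."""

    def __init__(self):
        super().__init__()
        self.images = []  # one dict per <img> tag, in document order

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        # Attribute values can be None (bare attribute) or whitespace only.
        alt = (attr_map.get("alt") or "").strip()
        title = (attr_map.get("title") or "").strip()
        self.images.append(
            {"src": attr_map.get("src") or "", "alt": alt, "title": title}
        )


def images_without_text(html):
    """Return the src of every image that has neither alt nor title text."""
    audit = ImageTextAudit()
    audit.feed(html)
    return [img["src"] for img in audit.images if not img["alt"] and not img["title"]]


# Hypothetical sample markup, made up for this example.
SAMPLE_HTML = """
<img src="sanji.png" alt="Greyscale line drawing of a man in a shirt and tie">
<img src="doodle.png">
<img src="sketch.png" alt="" title="">
<img src="valentine.png" title="Valentine's Day doodle">
"""

print(images_without_text(SAMPLE_HTML))
```

For the sample markup this lists doodle.png and sketch.png, the two images with nothing for a screen reader to read. Those would be the ones to describe by hand or to run through one of the image to text tools, with the artist's permission.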
hi pls don't use the ALT image option as an extra caption, that's meant to describe images for visually impaired users!
Tumblr media
Ok, so… some thoughts on alt text and visual impairment.
The original purpose of alt text is indeed to offer the visually impaired an opportunity to experience an image.
However. How does one experience an image? What is the purpose of the image?
So, to reveal a little about myself. I am visually impaired. I have one severely myopic barely functional eye, and the other is an indiscernible soup of color and shape.
From the functional eye, I try to take in whatever minor little detail I can. From the nonfunctional one, I suppose adaptation is in order: as the visual world no longer has depth, the realms of the other senses intersect along the crossroads of imagination in order to see that which you cannot.
We feel and experience through so-called trivialities and minutiae. Onomatopoeia of scritching along finely toothed margins.
Description itself, the thick kind that oozes with the flavor of the experience, is in a way a practice of inducing nostalgia.
My purpose, however, is to offer alternate hints for immersion. Did you miss out on this? Here’s a little something else for you to experience.
With technology, AI can already efficiently offer basic descriptions of images.
For me as a maker of things, immersion goes beyond a mimesis of that which exists. The experience is the tone, the mood, the absurd little notes in the margins.
212 notes · View notes
animunerdery · 1 year
Text
Tumblr media
Cross guild daddies mean business
661 notes · View notes
animunerdery · 1 year
Text
Tumblr media
Birthday present
120 notes · View notes
animunerdery · 1 year
Text
Tumblr media Tumblr media
Dilf doodles
295 notes · View notes
animunerdery · 1 year
Text
Tumblr media
Mugiwara Sportball
67 notes · View notes
animunerdery · 1 year
Text
Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media
Doodle dump
191 notes · View notes
animunerdery · 1 year
Note
Hey @wellfine. Thanks for the website! It actually sent me down an internet rabbit hole into the whole origin of alt text and its purpose.
Design considerations for accessibility aside, you make a great point that everything can coexist. Alternative adaptations, and literal transliterations can coexist to offer, as you put it, a more “complete” experience.
I probably should have led with this: for anyone who wants to, feel free to add functional descriptors to anything here!
This blog is for me to put out free silly drawings with silly notes in the margin for anyone who gets a kick out of that kind of thing.
I am a visually impaired person. I am blind in one eye and I live with the constant fear of going completely blind in both eyes.
I understand that some people like functional descriptions. I also understand that some people depend on screen readers.
But as a visual artist who is also visually impaired, facing a very real probability of having to rely on a screen reader at some point in my future, I actually have a gut aversion towards functional description.
My preference would be for that interpretive sculpture because it wouldn’t be a reminder of loss, but rather, a tangible object I would be able to fully appreciate.
Each time I have to think about or worry about potentially losing the other eye, I feel the pit of my stomach fall. And each time I return to the feeling of loss for the eye that simply does not work, grief comes in waves until I focus on that which I still have and that which I have gained.
There probably aren’t a lot of visual artists who have experienced visual loss. And even amongst us, we are all unique and have different reactions and points of view.
I’m just here doing silly things for my own amusement because we all cope with loss, impairment, and shortcomings in our own way. We can’t cater to everyone, and we won’t, but I hope all of you who took the time to read this keep trying to find the silly little things that bring joy to your lives.
212 notes · View notes
animunerdery · 1 year
Note
Thanks for this. It definitely let me see how flippant and dismissive my tone can come across as. For that I apologize.
I was being flippant in some ways (with the lol AI bit), but genuine in others. I honestly don’t think I am the one to provide “functional description”.
Partially for selfish, pretentious reasons, partially because I know my own limitations, and partially because I want to provide the most complete experience that I possibly can.
If I were to provide something for braille users, for example, I would rather produce something like a pixel image to “feel”. Is that selfish and pretentious? Probably. Is it also an attempt to be more considerate, and offer something by touch that words alone wouldn’t encompass? I hope so as well.
I’m pretty terrible with “functional descriptions”, not just from a “my arrogant-assed brain refuses to” POV, but from a “when I try, it comes off in a pretentious, flippant way” POV. AI isn’t just some joke, but an actual tool that provides much less emotionally biased “functional description” than I ever could.
Anyway, how I feel doesn’t matter. What does matter is what the original anon, and @wellfine alluded to. What is it that those who rely on screen readers want? What kind of experience would be ideal? What kind of experience is passable?
Has the experience I’ve been providing all this time felt like adding insult to injury? Or is there a way to better serve all of our needs?
212 notes · View notes
animunerdery · 1 year
Text
Tumblr media
Sleepy
1K notes · View notes