Phone-based veterinary cardiology AI was always going to get peer-reviewed. The only open questions were which tool, which institution, and which week. On May 13, 2026, the answer came back. The tool is RadAnalyzer. The institution is Texas A&M College of Veterinary Medicine. The week was the second week of May. The journal is PLoS One.

I have been running RadAnalyzer on Roger's films for almost a year. Roger is my senior dog with a documented heart murmur, and the AI has been part of his routine recheck workflow since well before there was peer-reviewed validation to point a colleague at. The new paper tells me what the workflow already suggested. The model is operating inside the inter-observer error band that two trained humans produce when they read the same radiograph. In some breed conformations, the model is more consistent than the human reader it is being compared to.

What the study actually did

Sonya Gordon and colleagues at Texas A&M had radiographs from 1,058 client-owned dogs across 80 breeds measured by both a trained observer and RadAnalyzer. Pearson correlation between RadAnalyzer and the trained observer was 0.917 for vertebral heart size (VHS) and 0.873 for vertebral left atrial size (VLAS). Mean bias was effectively zero: 0.002 vertebrae for VHS and 0.007 for VLAS. Limits of agreement were tight: plus or minus 0.85 vertebrae for VHS and roughly half a vertebra for VLAS.
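For readers who want to see how numbers like these are produced, here is a minimal sketch of a method-comparison calculation on paired readings. It assumes the conventional 95% limits of agreement (bias plus or minus 1.96 standard deviations of the paired differences); the paper's exact computation is not restated here. The function names and the four paired VHS values are invented for illustration and are not study data.

```python
import math
import statistics


def bland_altman(model, reader):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    Assumes the conventional 95% limits: bias +/- 1.96 * SD of differences.
    """
    if len(model) != len(reader) or len(model) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    diffs = [m - r for m, r in zip(model, reader)]
    bias = statistics.mean(diffs)
    half_width = 1.96 * statistics.stdev(diffs)  # sample SD of differences
    return bias, bias - half_width, bias + half_width


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length series."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented example readings in vertebrae (NOT data from the study).
model_vhs = [10.0, 10.5, 11.0, 11.5]
reader_vhs = [10.1, 10.4, 11.1, 11.4]

bias, lower, upper = bland_altman(model_vhs, reader_vhs)
print(f"bias={bias:.3f}, limits=({lower:.3f}, {upper:.3f})")
print(f"r={pearson_r(model_vhs, reader_vhs):.3f}")
```

Correlation and agreement answer different questions. Two methods can correlate strongly while one reads consistently higher than the other, which is why the bias and the limits of agreement are reported alongside the Pearson coefficient.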

The limits of agreement are where the result becomes interesting. Inter-observer variability for VHS has been documented at up to a full vertebra between two trained readers. The model's disagreement with the human reference, at plus or minus 0.85 vertebrae, is smaller than the disagreement documented between human readers.

For a generalist working at 7 PM, with a coughing Cavalier King Charles spaniel in the next room and no cardiology consult available, that is the operative claim. The question is no longer whether the AI is good enough. The question is whether it is consistent enough to make sequential measurements trustworthy across visits.

How the model is actually built

RadAnalyzer is a deep-learning-enabled, web-based, smartphone-optimized application built by RadAnalyzer LLC of Austin. It auto-measures VHS and VLAS on right lateral thoracic radiographs.

The implementation matters for understanding the result. The original validation cycle trained the algorithm on 801 radiographs and tested it on 199. The architecture is a geometrically informed ensemble. The system identifies seven anatomic landmark points on the lateral thoracic radiograph: the cranial aspect of T4 as the reference, the central and ventral aspect of the carina, the ventral-most aspect of the cardiac apex, the caudal vena cava, the cranial cardiac waist, and the landmark points for vertebral length measurement.

The clinical innovation in the method is the vertebral length calculation. Traditional VHS uses the length of a single vertebra, T4, as the reference unit. That choice means any imprecision in identifying a single vertebra propagates directly into the heart size number. RadAnalyzer computes vertebral length as the mean of T4 through T9, which averages out single-vertebra noise. The math is straightforward. Six measurements averaged together produce a more stable reference than one measurement alone: if the per-vertebra errors are independent, the noise in the averaged reference shrinks by a factor of about the square root of six, roughly 2.4.
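The averaging step can be sketched in a few lines. This is an illustration under stated assumptions, not RadAnalyzer's code. It assumes the score is the sum of the cardiac long and short axes divided by the averaged vertebral length, and that per-vertebra measurement errors are independent with equal spread. The function names and millimetre values are invented.

```python
import math
import statistics


def mean_vertebral_length(lengths_mm):
    """Average the measured lengths of T4 through T9 into one reference unit."""
    if not lengths_mm:
        raise ValueError("at least one vertebral length is required")
    return statistics.mean(lengths_mm)


def vertebral_score(long_axis_mm, short_axis_mm, vertebral_unit_mm):
    """Express the summed cardiac axes in vertebral units (assumed formulation)."""
    return (long_axis_mm + short_axis_mm) / vertebral_unit_mm


def averaged_reference_sd(single_sd_mm, n_vertebrae):
    """SD of the mean of n independent, equally noisy length measurements."""
    return single_sd_mm / math.sqrt(n_vertebrae)


# Invented measurements in millimetres (NOT from any real radiograph).
t4_to_t9 = [14.0, 14.2, 14.1, 14.4, 14.6, 14.7]
unit = mean_vertebral_length(t4_to_t9)
vhs = vertebral_score(long_axis_mm=85.0, short_axis_mm=65.0, vertebral_unit_mm=unit)

print(f"reference unit: {unit:.2f} mm, VHS: {vhs:.2f} vertebrae")
print(f"noise in reference: {averaged_reference_sd(0.6, 6):.2f} mm vs 0.60 mm for T4 alone")
```

The independence assumption is the part to be cautious about. A rotated or poorly exposed film can shift every vertebral landmark in the same direction, and averaging does not remove an error that all six measurements share.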

A clinician uploads an anonymized right lateral. The model identifies the seven landmark points, calculates VHS and VLAS using the averaged vertebral reference, and returns a number. There is no per-clinic license, no per-image fee, no PIMS integration project. The product is free for veterinarians.

A short history of how we got here

VHS as a measurement system was introduced by Buchanan and Bücheler in 1995 and has been the most-used objective heart-size measurement in companion animal cardiology since then. VLAS was developed later, as a radiographic surrogate for left atrial enlargement that does not require echocardiographic equipment most general practices do not own.

Both measurements share a known weakness. Inter- and intra-observer variability has been documented at up to one full vertebra between trained readers, with VLAS showing even larger variability than VHS because the absolute measurement is smaller and the rounding error correspondingly larger. The clinical community has known this for years. The fix has been to refer hard cases to boarded cardiologists, which is the right answer when one is available and a constraint when one is not.

AI-driven measurement is the technical fix. RadAnalyzer is not the first attempt. Earlier deep learning models from MetronMind and others have been validated in published work on smaller cohorts. The Texas A&M paper is the largest single-center validation of a free clinical tool in the category to date, and it puts the limits of agreement in the open literature where other developers and clinicians can argue with it on the data.

The economics matter

Most veterinary AI is sold on the SaaS playbook: per-clinic seats, per-image pricing, annual contracts. RadAnalyzer ships the inverse. The clinical use case sits inside a browser tab. The monetization comes later, if at all. That positioning has implications for what model accuracy is worth in the marketplace.

The training and operations data is itself a moat. Every time a clinician corrects a landmark point, the model learns. Texas A&M Small Animal Teaching Hospital interns generate verified labels under Gordon's supervision. The labeling pipeline is structurally faster than any commercial vendor can run without a teaching hospital partnership.

The founding story is the kind of thing pitch decks dream of. Tomas Reyes won Aggies Invent at Texas A&M with the prototype. Tabitha Baibos was a vet student at the time. The two co-founded RadAnalyzer LLC and brought Gordon, a board-certified veterinary cardiologist, into the development cycle. The result is a product with clinical pedigree at the founding team level, not bolted on later through advisory hires.

What this changes about the cardiology referral workflow

There is a second-order effect of validated free imaging AI that does not get talked about enough. Boarded veterinary cardiologists are a scarce resource. The American College of Veterinary Internal Medicine reports a small population of practicing cardiology diplomates relative to the patient population. The wait time for a cardiology consult in many regions is measured in weeks, not days.

A peer-reviewed free tool that produces consistent VHS and VLAS numbers changes the triage logic for that scarce resource. The generalist who can produce a credible objective measurement at the time of presentation has more information when deciding whether to refer urgently, refer routinely, or treat empirically and recheck in two weeks. The cardiologist receiving the eventual referral has a documented baseline number to compare against, rather than a verbal "I thought it looked enlarged" handed across at the start of the appointment.

The cardiologist's value-add is not measurement. It is the echocardiogram, the auscultation in person, the integration of the radiograph with the murmur grade and the dog's clinical presentation. Pushing the measurement layer onto a validated tool frees the cardiologist's time for the parts of the consult that actually require the specialty training. The economics of veterinary cardiology referral were already strained in 2024. A free validated tool that reduces the measurement burden is not a threat to cardiologists; it is a tool that makes the existing cardiologist supply go further.

The next paper to watch

The Texas A&M group has signaled that the next validation cycle will compare RadAnalyzer's radiographic measurements directly to echocardiographic measures of cardiac size. That is the gold-standard comparison the current paper does not provide. If the next paper finds that the AI-derived radiographic measurement correlates well with echocardiographic chamber dimensions, the case for using RadAnalyzer as a frontline screening tool gets stronger. If the correlation is weaker than the radiograph-to-radiograph correlation, the case shifts toward "use this for trending the same dog over time, not for diagnostic classification."

Either way, the next paper closes a gap in the evidence base. I will write it up when it lands.

Where the limitations live

The PLoS One paper is a single-center, retrospective, method-comparison study with one reference observer. The authors are explicit that future studies will need to compare AI-derived radiographic measures against echocardiographic measures of cardiac size. VHS itself is a surrogate. Pearson correlation against a single trained reader is not the same as agreement with the gold standard.

The caveat that matters most for practitioners is image quality. The study selected high-quality radiographs. Field conditions produce films that are rotated, under-penetrated, or include patient motion. The validation does not speak to performance in those cases. The next paper I would like to see is a multi-site study including community general practices, with intentional variation in radiograph quality, comparing RadAnalyzer against a panel of boarded cardiologists rather than a single trained observer.

The other open question is breed-conformation edge cases. Eighty breeds is a wide net, but flat-chested breeds (Bulldog, French Bulldog, Pug) and deep-chested sighthounds (Greyhound, Whippet, Saluki) present different geometric challenges to a landmark-detection model than mesomorphic breeds. The paper reports aggregate performance across all 80 breeds without a per-conformation subgroup analysis. That subgroup analysis is the next question I want answered.

What this means for your practice this week

Three specific things to do with this paper in your hands.

First, if you have not run RadAnalyzer at least once on a film you have already measured by hand, do that this week. The point is calibration. Put your own clinical eye next to the AI number on a familiar case and see how the two compare. The validation paper's limits of agreement give you the expected envelope. If your numbers fall inside it, the tool is doing what the paper says it does.
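The calibration check described above amounts to one comparison. This sketch uses the bias and limits of agreement quoted earlier in this article. The function name is invented, the VLAS half-width of 0.5 is only the "roughly half a vertebra" figure, and the sign convention (AI minus hand) is an assumption, which matters little when the reported bias is this close to zero.

```python
# Figures quoted in this article from the validation paper, in vertebrae.
VHS_BIAS = 0.002
VHS_LOA_HALF_WIDTH = 0.85
VLAS_BIAS = 0.007
VLAS_LOA_HALF_WIDTH = 0.5  # "roughly half a vertebra"; approximate


def within_envelope(hand, ai, bias=VHS_BIAS, half_width=VHS_LOA_HALF_WIDTH):
    """True if the AI-minus-hand difference lies inside bias +/- half_width."""
    difference = ai - hand  # sign convention is an assumption
    return abs(difference - bias) <= half_width


# Invented example: a hand-measured VHS of 11.0 against an AI reading of 11.4.
print(within_envelope(hand=11.0, ai=11.4))
# A VLAS pair checked against the tighter VLAS envelope.
print(within_envelope(hand=2.0, ai=2.6, bias=VLAS_BIAS, half_width=VLAS_LOA_HALF_WIDTH))
```

A single pair falling outside the envelope is not proof that either reading is wrong, since limits of agreement are expected to contain most paired differences rather than all of them. A pattern of misses across several familiar cases is the signal worth investigating.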

Second, the sequential recheck workflow is where the value sits. For dogs on cardiac medication, the question is "is the heart bigger or smaller than it was three months ago." Re-measuring by hand introduces reader drift. Running each visit's anonymized film through the same model removes the human variability from the comparison. This is the workflow change worth adopting this week.

Third, the owner conversation changes. When you explain a borderline VHS reading to an owner, you can now reference a peer-reviewed bias and limit of agreement instead of a clinical impression. "The AI and I read this at 11.2 with a documented inter-method error of less than a vertebra" lands differently than "I think it looks borderline." The owner gets a concrete number with an honest error bar. The clinician keeps the judgment.

The personal frame

Roger is the senior dog in our household. His baseline films are in RadAnalyzer's history. The reason I ran him through the tool before there was peer-reviewed validation is that I wanted a number to compare against future films, not because I trusted the AI to make a diagnostic call. The Texas A&M paper now lets me tell colleagues, with citations attached, what I have been doing in my own household for almost a year.

Pancake and Gigi are cats. RadAnalyzer is currently dog-only. The feline version of this study is what I want to see next. Cats have their own VHS reference ranges and breed-conformation issues, and a feline validation would close the loop for the multi-species household I run at home.

The next twelve months in veterinary imaging AI are about which models pair their peer-reviewed validation with a credible distribution path. RadAnalyzer just answered the validation half of that question. The distribution path is already a free browser tab. The interesting question in 2026 is not whether AI will read radiographs. It is which AI tools have crossed the peer-reviewed validation bar and which are still operating on vendor whitepapers. As of May 13, RadAnalyzer is on the first side of that line.

Forward this to the colleague who needs it more than you do.
