Too Hot or Too Cold? UConn Researcher Finds ‘Goldilocks Problem’ in Child Welfare Decision-Making

A major tool widely used in child welfare decision-making - and the way agencies try to implement it - may be hindering social workers.

An assessment tool commonly used in social work can be a hindrance when applied too rigidly or too loosely, according to researchers (Adobe Stock).

When something bad happens to a child, the public and policy response is swift and forceful.

How could this have happened?

What went wrong?

What do we do to make sure it never happens again?

When a family becomes erroneously or unnecessarily enmeshed in the child welfare system, that burden is largely invisible – a burden borne mostly by the family itself.

In both situations, the fault for the systemic failure is often placed on the caseworker – overburdened, under-resourced, and forced to make quick and critical judgments about the risk of harm or neglect to children.

But, according to a new study coauthored by a researcher in the UConn School of Social Work, a major tool used in child welfare decision-making – and the way agencies try to implement it – may be part of the problem.

“I think it seems appealing to have a consistent way to do something,” says Megan Feely, an assistant professor of social work who specializes in child welfare and child maltreatment prevention. “It’s when you get into the details that it becomes kind of murky.”

In their study – recently published in the journal Social Service Review – Feely and coauthor Emily Bosk, an assistant professor at the Rutgers University School of Social Work, examined the application of the Structured Decision-Making Model’s Risk Assessment in two states.

“What to do with children who need to be safe, and families who may need help keeping their children safe, seem like some of the most important decisions a state will make,” says Feely, “and it’s really sort of shocking how little attention has been given to how these decisions are made – these incredibly, incredibly important decisions.”

Commonly called the “RA,” the risk assessment is an actuarially based prognostic tool that provides a checklist child welfare workers use to help assess a family’s future risk. It asks questions – Has the family been involved in child welfare before? Have they had an allegation of neglect? Does the primary caretaker have a substance use problem? Do they have a current or past mental health problem? Are the children medically fragile? – and then categorizes the family as low, medium, high, or intensive risk, based on the worker’s responses.

The RA is considered the gold standard in child welfare decision-making, developed with the goal of providing a level of standardization and predictability. It was intended to be used in conjunction with workers’ clinical judgment but designed to eliminate some of the most glaring problems with clinical decision-making, such as individual variation in the interpretation of the same set of facts, implicit bias, and lack of knowledge about empirically established risk factors.

“The RA is premised on the idea that when workers follow it, different individuals are reasonably likely to come to the same conclusion about case actions,” Bosk and Feely wrote. “No longer will outcomes be random – that is, contingent on which worker a family is assigned.”

For their study, Bosk and Feely examined the RA’s use – reviewing policies and interviewing caseworkers and their supervisors – and found drastically different applications of the assessment between the two states at the organizational level. In the first, the application of the RA had been mandated by the legislature, and the tool was used strictly and in place of clinical judgment. In the other, while the RA was always completed, it was not a significant factor in decision-making, with clinical judgment typically driving decisions.

“We call this ‘the Goldilocks problem,’ because one state essentially totally privileges the RA score,” Feely says, “so it’s a too tight interpretation of what to do with it. And in the other, most workers don’t really use it, so it’s an overly loose interpretation of what to do with it and how to integrate it into clinical judgment. There’s no middle point.”

In the so-called “tight state,” workers explained they were unable to use anything other than the RA to make case determinations, which was not how its developers intended the assessment to be applied. Clinical assessment was discouraged and, because of the rigidity of the framework, some workers would intentionally circumvent the RA – changing scores to either increase or decrease the predicted risk – in order to achieve a case trajectory that better matched their otherwise disregarded clinical judgment.

By contrast, in the so-called “loose state,” workers were required to complete the RA, but it had little to no role in case decision-making, with the majority of workers relying on their clinical judgments and consultations with their supervisors to decide case trajectories. While the workers had significantly more flexibility in their decision-making, the researchers found, the potentially systemizing and standardizing effect of the RA was eliminated.

The problem, Feely says, comes down to a flaw in the RA itself: While the developers intended for the RA to be used in conjunction with clinical judgment, they never provided any guidance or methodology on how to integrate the two. The tendency is to blame the workers, or the agencies, for the RA’s shortcomings, she says, but workers consistently found the tool to be problematic, and the study validates those concerns.

“Without guidance, it’s not clear how to integrate them, exactly, because it’s not another piece of more qualitative information, which we would use in clinical judgment, but a hard number,” Feely says. “We found that organizational context really matters for the application of the RA, and that because it’s not specified in the model, organizations are responsible for figuring out how to integrate the score with clinical judgment themselves.”

While that led to some workers in the “tight state” manipulating the RA, it also led workers to escalate cases involving families that, through clinical judgment, would likely not have been considered at risk. Feely says that unnecessarily high rates of child welfare involvement, particularly in marginalized communities or communities with many Black, Indigenous, or other people of color, contribute to the overall sense that the system is unfair.

“You can see how frustrating it would be if your child had autism, or was categorized as having behavioral or mental health issues, and you were on antidepressants, and then all of a sudden you’re labeled as at risk,” she says. “You can’t do anything about those things. You can’t fix them. You’re not going to go off your antidepressants, because that obviously would make it worse.”

She continues, “It feels like the conservative option is to err on the side of having more false positives, where people that are really not at risk are misidentified as at risk. But there are real downsides to that, and I think that, in child welfare, we’re seeing a sort of paralleling with some of the attention that’s on police – there are longer-term big consequences when we keep getting it wrong, because people don’t trust the system.”

While a clinically based approach offers more nuance, she says, it also loses what could make the process more consistent. As the RA and other prognostic tools and their potential use in child welfare situations are being discussed, Feely says that this study offers a cautionary tale that should encourage policymakers to be wary of trusting a tool more than is warranted.

“A main issue is really having a more open discussion of how these sort of probability-based tools should be included into the context of clinical decision-making,” she says. “I think that the move toward trying to incorporate more evidence and a more scientific base in social services, is positive, but I think it has to be really carefully balanced with the limits of that science. Overestimating the science, and the veracity of it, and its ability to be applicable in a particular situation, can be just as problematic for families and society as under-using it.”


Primary data collection for this study was conducted by Dr. Bosk in an independent study and is unrelated to Dr. Feely’s association with the Connecticut Department of Children and Families (CT DCF). Dr. Feely’s co-authorship does not imply any association between the study data or findings and the CT DCF.