One nice observation: " . . . a voice could have a higher pitch and still be perceived as male if the speaker pronounced “s” sounds in a lower frequency, which is achieved by moving the tongue farther away from the teeth." And a second: "(Vocal) resonance is lower (that is focused more in the upper chest than in their sinuses) for people whose larynx is deeper in their throats, but people learn to manipulate the position of their larynx when they’re young, with male children pulling their larynxes down a little bit and female children pushing them up . . . "
In AH-EPS, rich vocal resonance, whether perceived as more "male" or "female," is essential for effective anchoring of sounds. (That may explain why new or vibrant vocal resonance is often experienced as representative of one's new L2 identity.) Here is one of the haptic video techniques used for enhancing "both ends" of the vocal resonance range. (There is some additional touch involved that is not immediately evident in the video.)
Managing the frequency and tongue position of the standard North American English (NAE) alveolar "hissing" grooved sibilant ('s') is not too difficult either when done "haptically." Doing so helps separate it from "sh" and from varieties of the sound that are considerably more fronted than in NAE. Notice in the video the effect of the technique in "pulling apart" 's' from 'sh.' It uses dynamic hand gestures and the sensation of aspiration "touching" the hand initially, along with lip rounding and un-rounding, to guide the tongue either up and back or down and forward in the mouth.
Does that resonate? If not, pick a different gender and do the videos again.