Vision pose message gives wrong orientation in PX4 when using vision estimation

Hello all,
I’m working on an indoor drone project using PX4, and I’m tuning it with the Local Position Estimator (LPE).

I use aruco mapping as the vision position sensing system. It reads the positions of several markers and returns the camera pose, which I send to PX4 as a vision pose message.

The camera pose data includes the position and the orientation of the camera (as a quaternion).
For example, when the camera points down it gives:

transform: 
  translation: 
    x: -0.402096438144
    y: -0.478118638689
    z: 0.86537350921
  rotation: 
    x: 0.715369508281
    y: -0.698334485157
    z: -0.023942101198
    w: 0.00147961225385

Here the rotation is the orientation quaternion of the camera.
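
To make the sample rotation above easier to read, it can be converted to an axis and an angle. This is only a minimal sketch in plain Python: `quat_to_axis_angle` is a hypothetical helper, not part of aruco mapping or mavros, and it assumes the (x, y, z, w) ordering shown in the transform.

```python
import math

def quat_to_axis_angle(x, y, z, w):
    """Convert a quaternion (x, y, z, w) to a unit axis and an angle in radians."""
    # Normalize first so small rounding errors do not break acos().
    n = math.sqrt(x * x + y * y + z * z + w * w)
    x, y, z, w = x / n, y / n, z / n, w / n
    angle = 2.0 * math.acos(max(-1.0, min(1.0, w)))
    s = math.sqrt(max(0.0, 1.0 - w * w))  # sin(angle / 2)
    if s < 1e-9:
        # Rotation is (almost) zero, so the axis is arbitrary.
        return (1.0, 0.0, 0.0), 0.0
    return (x / s, y / s, z / s), angle

# Sample rotation from the camera pose above.
axis, angle = quat_to_axis_angle(
    0.715369508281, -0.698334485157, -0.023942101198, 0.00147961225385
)
print("axis:", axis)
print("angle in degrees:", math.degrees(angle))
```

For this sample the angle comes out very close to 180 degrees, about an axis lying almost in the x-y plane. That suggests the camera frame is flipped relative to the map frame, rather than aligned with a level vehicle body.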

But if I copy the rotation directly into a PoseStamped message and publish it to the mavros topic /vision_pose/pose, the orientation in /local_position/pose becomes strange: z keeps growing and w keeps decreasing, giving something like

  orientation: 
    x: 0.0102347607451
    y: -0.00718852910962
    z: 0.00984470302202
    w: -0.999873331606

and before that (when the drone lies flat), it is

  orientation: 
    x: 0.0102347607451
    y: -0.00718852910962
    z: 0.00984470302202
    w: -0.999873331606

I know there must be something wrong with the coordinate frame transformation, but I’m not familiar with that, so I don’t know how to fix it.
Can anyone help?
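
For reference, a common way to handle this kind of mismatch is to multiply the measured camera quaternion by a fixed camera-to-body rotation before filling in the PoseStamped orientation. The sketch below is plain Python with hypothetical helper names, and it rests on loud assumptions: the sample quaternion was recorded while the vehicle sat level and facing the map x axis, the map frame is z-up (ENU-like, which is what mavros is generally expected to receive), and the camera is rigidly mounted. It is not a verified fix for this setup.

```python
import math

def quat_normalize(q):
    x, y, z, w = q
    n = math.sqrt(x * x + y * y + z * z + w * w)
    return (x / n, y / n, z / n, w / n)

def quat_conjugate(q):
    # For a unit quaternion the conjugate is the inverse rotation.
    x, y, z, w = q
    return (-x, -y, -z, w)

def quat_multiply(a, b):
    # Hamilton product a * b, both given as (x, y, z, w).
    ax, ay, az, aw = a
    bx, by, bz, bw = b
    return (
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
        aw * bw - ax * bx - ay * by - az * bz,
    )

# ASSUMPTION: camera reading taken while the vehicle is level (sample from the post).
q_cam_level = quat_normalize(
    (0.715369508281, -0.698334485157, -0.023942101198, 0.00147961225385)
)

# ASSUMPTION: the fixed mount offset is the inverse of that level reading,
# so a level vehicle maps to the identity orientation.
q_cam_to_body = quat_conjugate(q_cam_level)

def camera_to_body_orientation(q_cam):
    """Body orientation in the map frame = camera orientation * mount offset."""
    return quat_multiply(quat_normalize(q_cam), q_cam_to_body)

q_body = camera_to_body_orientation(q_cam_level)
print(q_body)
```

With this calibration the level reading maps to approximately (0, 0, 0, 1), and later camera readings would be converted with `camera_to_body_orientation` before being written to `pose.orientation`. If the map frame itself is not z-up, an additional fixed rotation on the left-hand side would be needed as well.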