Relation between kpts_3d_pred and pose #18

Open
nviolante25 opened this issue Jun 22, 2022 · 2 comments

Comments

@nviolante25

Hello,
Thank you for open-sourcing this amazing project!

I have a question about the convention for the transformation of the 3D box. EgoNet only produces an egocentric pose (i.e. in camera coordinates), corresponding to the rotation between the 3D box extracted from the keypoints and a template 3D box. We also have a translation corresponding to the first point in kpts_3d_pred, here.

To better understand the coordinate systems involved I'm doing the following experiment:

  1. Create a template 3D bounding box following this, in the canonical pose.
  2. Rotate it with the rotation matrix given by EgoNet, this one.

After doing these two steps, I still need one translation to place the 3D box in space (in the camera system). The question is, what translation should I use? Is it the one corresponding to the first point in kpts_3d_pred?

Thank you for your time

@Nicholasli1995
Owner

Hi, by default the translation of the input 3D box is used for visualization:

p3d_pred = np.concatenate([record['kpts_3d_before'][:, [0], :], p3d_pred], axis=1)

When ground truth boxes are specified, the ground truth translation is used instead.
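
A minimal sketch of the experiment described in the question: build a template box, rotate it, then add a translation to place it in the camera frame. The helper names (`template_box`, `yaw_rotation`, `place_box`), the axis convention (length along x, height along y, width along z) and all numbers are assumptions made for illustration, not EgoNet code. Here `t` stands in for the translation that would be read from `record['kpts_3d_before'][:, [0], :]`, and `R` stands in for the rotation predicted by EgoNet.

```python
import numpy as np

def template_box(length, height, width):
    """Return the 8 corners of an origin-centred box, shape (8, 3).

    Assumed convention: length along x, height along y, width along z.
    """
    signs = np.array([[sx, sy, sz]
                      for sx in (1, -1)
                      for sy in (1, -1)
                      for sz in (1, -1)], dtype=float)
    return signs * np.array([length, height, width]) / 2.0

def yaw_rotation(theta):
    """Rotation about the camera y axis (hypothetical stand-in for EgoNet's output)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def place_box(corners, R, t):
    """Apply X_cam = R @ X_obj + t to every corner (rows are points)."""
    return corners @ R.T + t

corners = template_box(4.0, 1.5, 1.8)   # made-up car-sized dimensions
R = yaw_rotation(np.pi / 6)             # made-up orientation
t = np.array([2.0, 1.0, 20.0])          # made-up translation in metres
box_cam = place_box(corners, R, t)
```

Because the template is centred at the origin, the centroid of `box_cam` equals `t`, which is a quick way to check that the translation was applied in the camera frame and not before the rotation.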

You can also try other translation estimation approaches. For example, use cv2.solvePnP to solve for the translation from the 2D keypoints predicted by EgoNet:

refined_prediction = ltr.pnp_refine(pseudo_box.T, observation, intrinsics,
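
To illustrate the idea of recovering the translation from 2D keypoints, here is a simplified NumPy-only sketch. It is not `ltr.pnp_refine` and it does not call `cv2.solvePnP`: it assumes the rotation is already known and the intrinsics have zero skew, in which case the pinhole projection equations are linear in the translation and can be solved by least squares. The function names, the intrinsic matrix `K` and the test values are all hypothetical.

```python
import numpy as np

def project(points_cam, K):
    """Pinhole projection of (N, 3) camera-frame points to (N, 2) pixels."""
    uvw = points_cam @ K.T
    return uvw[:, :2] / uvw[:, 2:3]

def solve_translation(points_obj, R, kpts_2d, K):
    """Least-squares translation given a known rotation (zero-skew K assumed).

    For each point q = R @ p, the projection equations rearrange to
        fx * tx - (u - cx) * tz = (u - cx) * qz - fx * qx
        fy * ty - (v - cy) * tz = (v - cy) * qz - fy * qy
    """
    q = points_obj @ R.T
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    du = kpts_2d[:, 0] - cx
    dv = kpts_2d[:, 1] - cy
    n = len(q)
    A = np.zeros((2 * n, 3))
    b = np.zeros(2 * n)
    A[0::2, 0] = fx
    A[0::2, 2] = -du
    b[0::2] = du * q[:, 2] - fx * q[:, 0]
    A[1::2, 1] = fy
    A[1::2, 2] = -dv
    b[1::2] = dv * q[:, 2] - fy * q[:, 1]
    t, *_ = np.linalg.lstsq(A, b, rcond=None)
    return t

# Synthetic check with made-up values: project a box, then recover t.
points_obj = np.array([[sx * 2.0, sy * 0.75, sz * 0.9]
                       for sx in (1, -1) for sy in (1, -1) for sz in (1, -1)])
theta = np.pi / 6
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
K = np.array([[720.0, 0.0, 620.0],
              [0.0, 720.0, 190.0],
              [0.0, 0.0, 1.0]])
t_true = np.array([2.0, 1.0, 20.0])
kpts_2d = project(points_obj @ R.T + t_true, K)
t_est = solve_translation(points_obj, R, kpts_2d, K)
```

With noise-free keypoints `t_est` matches `t_true`. With real predictions the result depends on keypoint accuracy, and a full PnP solver such as `cv2.solvePnP` would estimate the rotation jointly instead of taking it as given.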

@nviolante25
Author

Great, thanks for the answer!
