homog (32b35509) · Commits · Visitlab / PB009 Study materials

public/lectures/course5.html

+125 −1

		@@ -382,9 +382,133 @@
		</div>

		<h2 id="homog">Homogenous coordinates</h2>
		We will represent a point in a different coordinate system called homogenous coordinates.
		Consider a 2D plane in 3D space that does not pass through the origin $o$.
		Every line in the 3D space that passes through the origin and is not parallel to this plane intersects the plane at exactly one point $p$.
		Every such line can be represented by a 3D point $p'$ on the line other than the origin (or simply by a 3D direction vector, since we know
		that the line passes through the origin).
		Hence, the 2D point $p$ can be represented by any 3D point $p'$, other than the origin, that lies on this line.

		<div style="margin: auto; width: 35%;">
		<figure style="margin: auto; width:100%; display: flex;">
		<img src="img/5/hom1.png" alt="" class="img-fluid" style="width: 100%; height:auto; object-fit:contain;">
		</figure>
		</div>

		This representation is called homogenous coordinates.
		The relation between the 2D point $p$ and the 3D point $p'$ can be expressed mathematically as
		$ \begin{pmatrix} p_x \\ p_y \end{pmatrix} = \begin{pmatrix} \frac{p'_x}{p'_z} \\ \frac{p'_y}{p'_z} \end{pmatrix} $
		where $p'$ is the 3D point in homogenous coordinates, and $p$ is the corresponding 2D point. In other words,
		to convert from homogenous coordinates to Cartesian coordinates, we divide the first two coordinates by the third coordinate.


		<br ><br >
		However, as a convention, we usually use the letter $w$ instead of $z$ to denote the third coordinate in homogenous coordinates.
		Thus the conversion of the point $p_h$ from homogenous to Cartesian coordinates becomes:
		$$ p_h = {\begin{pmatrix} x_h \\ y_h \\ w \end{pmatrix}}_h \to {\begin{pmatrix} \frac{x_h}{w} \\ \frac{y_h}{w} \end{pmatrix}}_c \to {\begin{pmatrix} x \\ y \end{pmatrix}}_c $$
		It works the other way around as well. To convert from Cartesian to homogenous coordinates, we multiply both coordinates by $w$ and append $w$ as the third coordinate:
		$$ p_c = {\begin{pmatrix} x \\ y \end{pmatrix}}_c \to {\begin{pmatrix} w x \\ w y \\ w \end{pmatrix}}_h \to {\begin{pmatrix} x_h \\ y_h \\ w \end{pmatrix}}_h $$
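		The two conversions above can be sketched in a few lines of Python. This is only an illustration: the function names <code>to_cartesian</code> and <code>to_homogenous</code> are made up for this example, and raising an error for $w = 0$ is a choice of this sketch, not part of the definition.

```python
def to_cartesian(p_h):
    """Convert a homogenous 2D point (x_h, y_h, w) to Cartesian (x, y)."""
    x_h, y_h, w = p_h
    if w == 0:
        raise ValueError("w = 0 has no Cartesian counterpart")
    return (x_h / w, y_h / w)


def to_homogenous(p_c, w=1.0):
    """Convert a Cartesian 2D point (x, y) to homogenous (w*x, w*y, w)."""
    if w == 0:
        raise ValueError("w must be non-zero for a point")
    x, y = p_c
    return (w * x, w * y, w)


p = (3.0, 4.0)
print(to_homogenous(p))                       # (3.0, 4.0, 1.0)
print(to_homogenous(p, w=2.0))                # (6.0, 8.0, 2.0)
print(to_cartesian(to_homogenous(p, w=2.0)))  # (3.0, 4.0)
```

		Converting to homogenous coordinates with any non-zero $w$ and back again returns the original point.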

		Note that the choice of $w$ is arbitrary. Any non-zero value of $w$ will result in the same Cartesian coordinates after conversion.
		In the illustration below, both the point with $w=1$ and the point with $w=2$ represent the same 2D point.
		This also means that a single point in Cartesian coordinates can be represented by infinitely many points in homogenous coordinates.
		However, in computer graphics, we usually set $w$ to 1 for simplicity.
		<div style="margin: auto; width: 35%;">
		<figure style="margin: auto; width:100%; display: flex;">
		<img src="img/5/hom2.png" alt="" class="img-fluid" style="width: 100%; height:auto; object-fit:contain;">
		</figure>

		</div>

		The only exception to the arbitrary choice of $w$ is $w = 0$. From the formula for conversion from homogenous to Cartesian coordinates,
		we can see that when $w = 0$, the conversion leads to division by zero, which essentially means that
		in Cartesian coordinates, such a point lies at infinity. We can't really work with such
		points in Cartesian coordinates, so having a different coordinate system where we can represent
		and distinguish them is very useful.
		Furthermore, points at infinity are often used to represent directions in space rather than specific locations, so $w = 0$ is
		also used for representing vectors in homogenous coordinates.
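		<br ><br >
		To get a feeling for why $w = 0$ corresponds to a point at infinity, we can keep the first two homogenous coordinates fixed and let $w$ shrink towards zero. The following Python sketch does exactly that; the helper name <code>cartesian_for_w</code> and the sample values are chosen only for this illustration.

```python
def cartesian_for_w(x_h, y_h, w):
    """Cartesian point represented by the homogenous triple (x_h, y_h, w)."""
    return (x_h / w, y_h / w)


# Fixed first two coordinates, shrinking w: the Cartesian point moves
# farther and farther from the origin, always along the direction (1, 2).
for w in (1.0, 0.1, 0.01, 0.001):
    print(w, cartesian_for_w(1.0, 2.0, w))
```

		The printed points run away from the origin along one fixed direction, which is why the limit case $w = 0$ is read as that direction rather than as a location.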

		<br ><br >
		The ability to represent points at infinity is one of the key advantages of using homogenous coordinates in computer graphics.
		Its usefulness will become more apparent when we discuss projection transformations later in the lecture.
		For now, let's get back to affine transformations and see how homogenous coordinates help us with the issue of
		matrix representation for translation. First, let's see what the original translation formula looks like in homogenous coordinates:

		$$p' = p + {\color{#38ba45}\vec{t}} \to \begin{pmatrix} p'_x \\ p'_y \\ w \end{pmatrix}
		= \begin{pmatrix} p_x \\ p_y \\ 1 \end{pmatrix} + {\color{#38ba45}\begin{pmatrix} t_x \\ t_y \\ 0\end{pmatrix}}
		= \begin{pmatrix} p_x + {\color{#38ba45}t_x}\\ p_y + {\color{#38ba45}t_y} \\ 1 + {\color{#38ba45}0} \end{pmatrix} $$

		We set $w = 1$ for points and $w = 0$ for the translation vector, since it is a vector.
		Now, let's see how we can express this using matrix notation.
		$$p' = {\color{#38ba45}T(t_x, t_y)} p \to \begin{pmatrix} p'_x \\ p'_y \\ 1 \end{pmatrix}
		= {\color{#38ba45}\begin{pmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{pmatrix}} \begin{pmatrix} p_x \\ p_y \\ 1 \end{pmatrix}
		= \begin{pmatrix} 1 \cdot p_x + 0 \cdot p_y + {\color{#38ba45}t_x} \cdot 1 \\ 0 \cdot p_x + 1 \cdot p_y + {\color{#38ba45}t_y} \cdot 1 \\ 0 \cdot p_x + 0 \cdot p_y + 1 \cdot 1 \end{pmatrix}
		= \begin{pmatrix} p_x + {\color{#38ba45}t_x}\\ p_y + {\color{#38ba45}t_y} \\ 1 \end{pmatrix} $$

		We get the same result. With homogenous coordinates, we can represent translation using matrix multiplication.
		Now, to be able to combine translation with other affine transformations, we just need to express them in homogenous coordinates as well.
		We want them to behave the same as before. For that to work, we simply add an extra row and column to the matrices of the other transformations
		and set the values in the last row and column to 0s, except for the bottom-right value, which we set to 1.
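		<br ><br >
		As a quick numerical check of the translation matrix, the following Python sketch multiplies it with a point ($w = 1$) and with a vector ($w = 0$). The helpers <code>translation</code> and <code>mat_vec</code> are hypothetical names written for this example, using plain nested lists instead of a matrix library.

```python
def translation(tx, ty):
    """3x3 homogenous matrix that translates 2D points by (tx, ty)."""
    return [[1.0, 0.0, tx],
            [0.0, 1.0, ty],
            [0.0, 0.0, 1.0]]


def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


T = translation(5.0, -2.0)
point = [1.0, 1.0, 1.0]    # w = 1: a location
vector = [1.0, 1.0, 0.0]   # w = 0: a direction

print(mat_vec(T, point))   # [6.0, -1.0, 1.0]
print(mat_vec(T, vector))  # [1.0, 1.0, 0.0]
```

		The point is moved by the translation, while the vector is left unchanged, because its $w = 0$ cancels the last column of the matrix.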
		<br><br>
		2D rotation in homogenous coordinates:
		$$p' = {\color{#38ba45}R} p \to \begin{pmatrix} p'_x \\ p'_y \\ 1 \end{pmatrix} = {\color{#38ba45}\begin{pmatrix} \cos \alpha & -\sin \alpha & 0 \\ \sin \alpha & \cos \alpha & 0 \\ 0 & 0 & 1 \end{pmatrix}} \begin{pmatrix} p_x \\ p_y \\ 1 \end{pmatrix}
		= \begin{pmatrix} {\color{#38ba45}\cos \alpha} \cdot p_x {\color{#38ba45}-\sin \alpha} \cdot p_y \\ {\color{#38ba45}\sin \alpha} \cdot p_x + {\color{#38ba45}\cos \alpha} \cdot p_y \\ 1 \end{pmatrix}$$
		2D scale in homogenous coordinates:
		$$p' = {\color{#38ba45}S} p \to \begin{pmatrix} p'_x \\ p'_y \\ 1 \end{pmatrix} = {\color{#38ba45}\begin{pmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{pmatrix}} \begin{pmatrix} p_x \\ p_y \\ 1 \end{pmatrix}
		= \begin{pmatrix} {\color{#38ba45}s_x} \cdot p_x \\ {\color{#38ba45}s_y} \cdot p_y \\ 1 \end{pmatrix}$$
		2D shear in homogenous coordinates:
		$$p' = {\color{#38ba45}H} p \to \begin{pmatrix} p'_x \\ p'_y \\ 1 \end{pmatrix} = {\color{#38ba45}\begin{pmatrix} 1 & a & 0 \\ b & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}} \begin{pmatrix} p_x \\ p_y \\ 1 \end{pmatrix}
		= \begin{pmatrix} p_x + {\color{#38ba45}a} \cdot p_y \\ {\color{#38ba45}b} \cdot p_x + p_y \\ 1 \end{pmatrix}$$
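		The three matrices above share one pattern: the original 2x2 matrix padded with an extra row and column. A minimal Python sketch of that pattern follows; the helper <code>embed</code> and the other function names are assumptions made for this example only.

```python
import math


def embed(m2):
    """Pad a 2x2 matrix with an extra row and column: 0s, and 1 at bottom-right."""
    (a, b), (c, d) = m2
    return [[a, b, 0.0],
            [c, d, 0.0],
            [0.0, 0.0, 1.0]]


def rotation(alpha):
    c, s = math.cos(alpha), math.sin(alpha)
    return embed([[c, -s], [s, c]])


def scale(sx, sy):
    return embed([[sx, 0.0], [0.0, sy]])


def shear(a, b):
    return embed([[1.0, a], [b, 1.0]])


def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


p = [1.0, 2.0, 1.0]  # the point (1, 2) with w = 1
print(mat_vec(scale(2.0, 3.0), p))  # [2.0, 6.0, 1.0]
print(mat_vec(shear(0.5, 0.0), p))  # [2.0, 2.0, 1.0]
print([round(c, 6) for c in mat_vec(rotation(math.pi / 2), p)])
```

		The last line rotates the point by 90 degrees, which moves $(1, 2)$ to approximately $(-2, 1)$; the rounding only hides floating-point noise. In all three cases the third coordinate stays 1.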

		Now we can combine multiple transformations by simply multiplying their matrices. Another advantage of having a unified representation for all operations, both in 2D, as seen here, and in 3D, is that we can build a hardware component that performs matrix multiplication and integrate it directly into the GPU.
		If we kept separate definitions for translation and the other transformations, the GPU would need some intelligent component that decides where to send the data to perform the appropriate transformation. Also, if we wanted to perform multiple transformations, such as a translation and a rotation, we would have to copy the data to two different computation units, making the whole process much slower.
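		<br ><br >
		Combining transformations by multiplying their matrices can be sketched as follows. The example assumes the column-vector convention used in the formulas above ($p' = M p$), so in a product such as $T R$ the rightmost matrix is applied first. The helper names and the concrete matrices (a 90-degree rotation and a translation by $(2, 0)$) are chosen only for illustration.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


T = [[1, 0, 2],    # translation by (2, 0)
     [0, 1, 0],
     [0, 0, 1]]
R = [[0, -1, 0],   # rotation by 90 degrees (cos = 0, sin = 1)
     [1, 0, 0],
     [0, 0, 1]]
p = [1, 0, 1]      # the point (1, 0)

# Rotating first and translating second, one step at a time ...
step_by_step = mat_vec(T, mat_vec(R, p))
# ... gives the same result as applying the single combined matrix T R.
combined = mat_vec(mat_mul(T, R), p)
print(step_by_step, combined)      # [2, 1, 1] [2, 1, 1]

# The order matters: R T translates first and rotates second.
print(mat_vec(mat_mul(R, T), p))   # [0, 3, 1]
```

		The combined matrix can be computed once and then reused for every point, which is what makes the unified matrix form attractive.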

		<br ><br >
		The transformations work much the same way in 3D space. We just need to use 4D homogenous coordinates for points and 4x4 matrices for transformations.
		The extension is trivial for translation and scaling, since we just add an extra row and column for the extra coordinates.
		<br ><br >
		3D translation in homogenous coordinates:
		$$p' = {\color{#38ba45}T(t_x, t_y, t_z)} p \to \begin{pmatrix} p'_x \\ p'_y \\ p'_z \\ 1 \end{pmatrix}
		= {\color{#38ba45}\begin{pmatrix} 1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & t_z \\ 0 & 0 & 0 & 1 \end{pmatrix}} \begin{pmatrix} p_x \\ p_y \\ p_z \\ 1 \end{pmatrix} $$
		3D scale in homogenous coordinates:
		$$p' = {\color{#38ba45}S(s_x, s_y, s_z)} p \to \begin{pmatrix} p'_x \\ p'_y \\ p'_z \\ 1 \end{pmatrix} = {\color{#38ba45}\begin{pmatrix} s_x & 0 & 0 & 0 \\ 0 & s_y & 0 & 0 \\ 0 & 0 & s_z & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}} \begin{pmatrix} p_x \\ p_y \\ p_z \\ 1 \end{pmatrix} $$
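		A minimal Python sketch of the two 4x4 matrices above, built by starting from the identity matrix and filling in the relevant entries. The function names <code>identity</code>, <code>translation3</code>, <code>scale3</code> and <code>mat_vec</code> are hypothetical and exist only in this example.

```python
def identity(n):
    """n x n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def translation3(tx, ty, tz):
    m = identity(4)
    m[0][3], m[1][3], m[2][3] = tx, ty, tz   # last column holds the offset
    return m


def scale3(sx, sy, sz):
    m = identity(4)
    m[0][0], m[1][1], m[2][2] = sx, sy, sz   # diagonal holds the factors
    return m


def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


p = [1.0, 2.0, 3.0, 1.0]  # the 3D point (1, 2, 3) with w = 1
print(mat_vec(translation3(10.0, 20.0, 30.0), p))  # [11.0, 22.0, 33.0, 1.0]
print(mat_vec(scale3(2.0, 2.0, 2.0), p))           # [2.0, 4.0, 6.0, 1.0]
```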

		However, rotation in 3D is a bit more complicated, since we need to specify the axis of rotation as well as the angle.
		There are three basic rotation matrices, one for each axis. We will soon also show how to perform a rotation around an arbitrary axis, but for now we focus on the basic ones.
		Compared to the 2D rotation, we are mostly just moving the terms around and flipping the signs of the sine terms. An easy trick for recognizing which axis a matrix rotates around is that the row and column corresponding to that axis contain 0 everywhere except on the diagonal, where they contain 1.
		<br ><br >
		3D rotation around x-axis in homogenous coordinates:
		$$p' = {\color{#38ba45}R_x(\alpha)} p \to \begin{pmatrix} p'_x \\ p'_y \\ p'_z \\ 1 \end{pmatrix} = {\color{#38ba45}\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos \alpha & -\sin \alpha & 0 \\ 0 & \sin \alpha & \cos \alpha & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}} \begin{pmatrix} p_x \\ p_y \\ p_z \\ 1 \end{pmatrix} $$

		3D rotation around y-axis in homogenous coordinates:
		$$p' = {\color{#38ba45}R_y(\alpha)} p \to \begin{pmatrix} p'_x \\ p'_y \\ p'_z \\ 1 \end{pmatrix} = {\color{#38ba45}\begin{pmatrix} \cos \alpha & 0 & \sin \alpha & 0 \\ 0 & 1 & 0 & 0 \\ -\sin \alpha & 0 & \cos \alpha & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}} \begin{pmatrix} p_x \\ p_y \\ p_z \\ 1 \end{pmatrix} $$

		3D rotation around z-axis in homogenous coordinates:
		$$p' = {\color{#38ba45}R_z(\alpha)} p \to \begin{pmatrix} p'_x \\ p'_y \\ p'_z \\ 1 \end{pmatrix} = {\color{#38ba45}\begin{pmatrix} \cos \alpha & -\sin \alpha & 0 & 0 \\ \sin \alpha & \cos \alpha & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}} \begin{pmatrix} p_x \\ p_y \\ p_z \\ 1 \end{pmatrix} $$
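		The three rotation matrices can be checked numerically. The Python sketch below rotates the unit axis directions (with $w = 0$) by 90 degrees: with the matrices as written above, $R_z$ takes the x-axis to the y-axis, $R_x$ takes the y-axis to the z-axis, and $R_y$ takes the z-axis to the x-axis. All function names are assumptions of this example.

```python
import math


def rotation_x(alpha):
    c, s = math.cos(alpha), math.sin(alpha)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, c, -s, 0.0],
            [0.0, s, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def rotation_y(alpha):
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, 0.0, s, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [-s, 0.0, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def rotation_z(alpha):
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, -s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def rounded(v):
    """Round away floating-point noise such as cos(pi / 2) = 6e-17."""
    return [round(c, 6) + 0.0 for c in v]


X = [1.0, 0.0, 0.0, 0.0]
Y = [0.0, 1.0, 0.0, 0.0]
Z = [0.0, 0.0, 1.0, 0.0]
quarter = math.pi / 2

print(rounded(mat_vec(rotation_z(quarter), X)))  # the y-axis direction
print(rounded(mat_vec(rotation_x(quarter), Y)))  # the z-axis direction
print(rounded(mat_vec(rotation_y(quarter), Z)))  # the x-axis direction
```

		Each matrix also leaves its own axis untouched, which matches the row-and-column pattern of 0s and a single 1.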

		When defining the rotations, one important aspect to consider is the direction of rotation.
		The rotation matrices defined above follow the right-hand rule, which means that if you curl the fingers of your right hand in the direction of rotation, your thumb will point in the positive direction of the axis of rotation.
		This is a standard convention used in mathematics and computer graphics to ensure consistency in the definition of rotation.

		<div style="margin: auto; width: 80%;">
		<figure style="margin: auto; width:100%; display: flex;">
		<img src="img/5/rh2.png" alt="" class="img-fluid" style="width: 25%; height:auto; object-fit:contain;">
		<img src="img/5/rh.png" alt="" class="img-fluid" style="width: 75%; height:auto; object-fit:contain;">
		</figure>
		<figcaption style="text-align: center; margin: auto">Right-hand rule of rotation. Sources: https://www.khanacademy.org/, https://www.cs.nccu.edu.tw/~lien/BCC/VRML/trans11.htm
		</figcaption>
		</div>
		<br>
		Finally, for shear, we can choose the plane in which we want to skew the object.
		For the YZ plane, we set the off-diagonal parameters in the first column;
		for the XZ plane, we set the parameters in the second column; and finally, for the XY plane,
		we use the third column.
		3D shear in homogenous coordinates:
		$$p' = {\color{#38ba45}H(YZ_y, YZ_z, XZ_x, XZ_z, XY_x, XY_y)} p \to \begin{pmatrix} p'_x \\ p'_y \\ p'_z \\ 1 \end{pmatrix} = {\color{#38ba45}\begin{pmatrix} 1 & {\color{#059644}XZ_x} & {\color{#059644}XY_x} & 0 \\ {\color{#059644}YZ_y} & 1 & {\color{#059644}XY_y} & 0 \\ {\color{#059644}YZ_z} & {\color{#059644}XZ_z} & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}} \begin{pmatrix} p_x \\ p_y \\ p_z \\ 1 \end{pmatrix} $$
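		The 3D shear matrix can be sketched in Python as follows. The parameter names mirror the symbols in the matrix above; the function names <code>shear3</code> and <code>mat_vec</code> and the default value of 0 for every parameter are assumptions of this example.

```python
def shear3(yz_y=0.0, yz_z=0.0, xz_x=0.0, xz_z=0.0, xy_x=0.0, xy_y=0.0):
    """4x4 homogenous shear matrix; all parameters 0 gives the identity."""
    return [[1.0, xz_x, xy_x, 0.0],
            [yz_y, 1.0, xy_y, 0.0],
            [yz_z, xz_z, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


# Only the third column is used: x is shifted in proportion to z.
H = shear3(xy_x=0.5)
print(mat_vec(H, [0.0, 0.0, 2.0, 1.0]))  # [1.0, 0.0, 2.0, 1.0]
print(mat_vec(H, [0.0, 0.0, 0.0, 1.0]))  # [0.0, 0.0, 0.0, 1.0]
```

		A point with $z = 2$ is pushed by $0.5 \cdot 2 = 1$ along the x-axis, while a point with $z = 0$ stays where it is.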

		<h2 id="composite">Composite transformations</h2>


		<h2 id="mvp">Model, view, projection</h2>

		</div>

public/lectures/img/5/hom1.png

0 → 100644

+17.6 KiB


public/lectures/img/5/hom2.png

0 → 100644

+27.7 KiB


public/lectures/img/5/rh.png

0 → 100644

+119 KiB


public/lectures/img/5/rh2.png

0 → 100644

+6.57 KiB
