From your formulation it's still unclear what are the boundary conditions. But from the sentence about replacing $n\cdot \nabla u$ with $h(T_{\infty} - u)$ and your comment it sounds like your boundary conditions are $hT_{\infty} u + n\cdot \nabla u = hT_{\infty}$ so I will just handle the general case.
Suppose you have the initial boundary value problem:
\begin{alignat}{2}
\partial_t u(t,\vec{x}) &= \Delta u(t,\vec{x}), &\quad& \vec{x}\in\Omega, \, t\in (0,\infty) \\
\alpha(\vec{x})u(t,\vec{x}) + \partial_n u(t,\vec{x}) &= g(\vec{x}), &\quad& \vec{x}\in\partial\Omega, \, t\in(0,\infty), \\
u(0,\vec{x}) &= f(\vec{x}), &\quad& \vec{x}\in\Omega.
\end{alignat}
Multiplying the first equation by a test function $v$ and using the divergence theorem leads to:
\begin{equation}
\int_{\Omega} v \partial_t u = \int_{\Omega} v\Delta u = -\int_{\Omega}\nabla v \cdot \nabla u + \int_{\Omega} \nabla \cdot (v\nabla u) = -\int_{\Omega}\nabla v \cdot \nabla u + \int_{\partial\Omega} v\partial_n u.
\end{equation}
You can substitute the boundary condition:
$$\int_{\partial\Omega} v\partial_n u = \int_{\partial\Omega} v (g-\alpha u).$$
Then reordering some terms you get:
$$\int_{\Omega} v \partial_t u +\int_{\Omega}\nabla v \cdot \nabla u + \int_{\partial\Omega} v \alpha u = \int_{\partial\Omega} v g.$$
If $v$ is from a finite-dimensional space then it can be written as a linear combination of some basis vectors $\psi_1,\ldots,\psi_n$, similarly if $u$ is from a finite-dimensional space then it can be written as a linear combination of some basis vectors $\phi_1,\ldots,\phi_n$. Setting $v$ to $\psi_1,\ldots,\psi_n$ consecutively and $u(t,\vec{x}) = \sum_{j=1}^n a_j(t) \phi_j(\vec{x})$ you get for $1\leq i \leq n$:
\begin{gather}
\sum_{j=1}^n \left(\int_{\Omega} \psi_i \phi_j\right) \partial_t a_j(t) + \sum_{j=1}^n \left(\int_{\Omega} \nabla \psi_i \cdot \nabla \phi_j\right) a_j(t)
+ \sum_{j=1}^n \left(\int_{\partial\Omega} \alpha \psi_i \phi_j\right) a_j(t)= \int_{\partial\Omega} \psi_i g.
\end{gather}
The mass matrix is $M_{ij} = \int_{\Omega} \psi_i\phi_j$, the stiffness matrix is $W_{ij} = \int_{\Omega}\nabla \psi_i \cdot \nabla \phi_j$, the boundary correction matrix is $B_{ij} = \int_{\partial\Omega} \alpha \psi_i \phi_j$, the right-hand side is $b = \int_{\partial\Omega} \psi_i g$. Then you have:
$$M\partial_t a + W a + B a = b \implies \partial_t a = -M^{-1}(W+B)a + M^{-1}b.$$
Formally the solution is:
$$a(t) = \exp(-tM^{-1}(W+B))a(0) + \int_0^{t} \exp((t-s)M^{-1}(W+B)) b\,ds,$$
although the exponential may not be easy to compute (the integral can be rewritten so that it disappears and you only have a matrix exponential but I will not go into that unless you need it).
You can instead apply e.g. explicit Euler to the above:
\begin{align}
\frac{a^{k+1}-a^k}{\tau} = -M^{-1}(W+B)a^k + M^{-1}b \implies a^{k+1} = (I-\tau M^{-1}(W+B))a^k + \tau M^{-1}b,
\end{align}
or implicit Euler:
\begin{align}
M\frac{a^{k+1}-a^k}{\tau} = -(W+B)a^{k+1} + b \implies a^{k+1} = (M+\tau (W+B))^{-1}Ma^k + \tau b.
\end{align}
With implicit Euler you can take arbitrary large step sizes and it will remain stable (if the matrices $(W+B)$ and $M$ are positive semi-definite). Generally you will also have that $\psi_i = \phi_i$ which would yield symmetric positive semi-definite matrices (I am not sure how the $B$ affects that, maybe there are some conditions on the $\alpha$) allowing you to use the conjugate gradient method.
In either case from $u(0,\vec{x}) = f(\vec{x})$ you also get:
$$\int_{\Omega} v u = \int_{\Omega} v f \implies a(0) = M^{-1} \begin{bmatrix} \int_{\Omega} \psi_1 f \\ \vdots \\ \int_{\Omega} \psi_n f\end{bmatrix}.$$
So you know what $a^0$ is. Also one often makes $M$ diagonal, also known as a lumped mass matrix (see this post https://scicomp.stackexchange.com/a/19714/34261), this can for instance save resources in the inversion if you use explicit Euler. If you use lumping there you will not need to solve linear systems. On the other hand, for explicit Euler you will have some constraints on the step size $\tau$ in order to guarantee stability, e.g. $\tau < 2/\rho(M^{-1}(W+B))$.