Introduction to Tensors and Variables

Learning Objectives
Introduction

In this notebook, we look at tensors, which are multi-dimensional arrays with a uniform type (called a dtype). You can see all supported dtypes at tf.dtypes.DType. If you're familiar with NumPy, tensors are (kind of) like np.arrays. All tensors are immutable like Python numbers and strings: you can never update the contents of a tensor, only create a new one. We also look at variables, the mutable counterpart: the recommended way to represent shared, persistent state that your program manipulates.

Each learning objective will correspond to a #TODO in the notebook where you will complete the notebook cell's code before running. Refer to the solution for reference.

Load necessary libraries

We will start by importing the necessary libraries for this lab.
Lab Task 1: Understand Basic and Advanced Tensor Concepts

Basics

Let's create some basic tensors. Here is a "scalar" or "rank-0" tensor. A scalar contains a single value and no "axes".
A "vector" or "rank-1" tensor is like a list of values. A vector has one axis:

A "matrix" or "rank-2" tensor has two axes:

Tensors may have more axes; here is a tensor with three axes:
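A minimal sketch of what the cells above create, with illustrative values (the exact numbers are not prescribed by the lab):

```python
import tensorflow as tf

rank_0 = tf.constant(4)                          # scalar: a single value, no axes
rank_1 = tf.constant([2.0, 3.0, 4.0])            # vector: one axis
rank_2 = tf.constant([[1, 2], [3, 4], [5, 6]])   # matrix: two axes
rank_3 = tf.constant([[[0, 1], [2, 3]],          # three axes
                      [[4, 5], [6, 7]]])

print(rank_0.ndim, rank_1.ndim, rank_2.ndim, rank_3.ndim)  # 0 1 2 3
```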
There are many ways you might visualize a tensor with more than two axes.
You can convert a tensor to a NumPy array either using np.array or the tensor.numpy method:
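Both conversion routes look like this (the tensor values are illustrative):

```python
import numpy as np
import tensorflow as tf

t = tf.constant([[1.0, 2.0], [3.0, 4.0]])

a = np.array(t)   # np.array() accepts tensors directly
b = t.numpy()     # every eager tensor also has a .numpy() method

print(a)
print(b)
```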
Tensors often contain floats and ints, but have many other types, including complex numbers and strings.
The base tf.Tensor class requires tensors to be "rectangular": along each axis, every element is the same size. However, there are specialized types of tensors that can handle different shapes, such as ragged tensors and sparse tensors (covered below).
We can do basic math on tensors, including addition, element-wise multiplication, and matrix multiplication.
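A short sketch of those three operations; note that each has an operator shorthand:

```python
import tensorflow as tf

a = tf.constant([[1, 2], [3, 4]])
b = tf.ones([2, 2], dtype=tf.int32)

print(tf.add(a, b))       # element-wise addition, same as a + b
print(tf.multiply(a, b))  # element-wise multiplication, same as a * b
print(tf.matmul(a, b))    # matrix multiplication, same as a @ b
```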
Tensors are used in all kinds of operations (ops).
About shapes

Tensors have shapes. Some vocabulary:

- Shape: The length (number of elements) of each of the axes of a tensor.
- Rank: The number of tensor axes. A scalar has rank 0, a vector has rank 1, a matrix has rank 2.
- Axis or Dimension: A particular dimension of a tensor.
- Size: The total number of items in the tensor, the product of the shape vector's elements.
Note: Although you may see reference to a "tensor of two dimensions", a rank-2 tensor does not usually describe a 2D space. Tensors and tf.TensorShape objects have convenient properties for accessing these:
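For example, a rank-4 tensor of zeros exposes its dtype, rank, shape, and size like this:

```python
import tensorflow as tf

# e.g. a (batch, height, width, features) layout
rank_4 = tf.zeros([3, 2, 4, 5])

print("Type of every element:", rank_4.dtype)
print("Number of axes:", rank_4.ndim)
print("Shape of tensor:", rank_4.shape)
print("Elements along axis 0:", rank_4.shape[0])
print("Elements along the last axis:", rank_4.shape[-1])
print("Total number of elements:", tf.size(rank_4).numpy())
```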
While axes are often referred to by their indices, you should always keep track of the meaning of each. Often axes are ordered from global to local: The batch axis first, followed by spatial dimensions, and features for each location last. This way feature vectors are contiguous regions of memory.
Lab Task 2: Understand Single-Axis and Multi-Axis Indexing

Single-axis indexing

TensorFlow follows standard Python indexing rules, similar to indexing a list or a string in Python, and the basic rules for NumPy indexing:

- indexes start at 0
- negative indices count backwards from the end
- colons, :, are used for slices: start:stop:step
Indexing with a scalar removes the dimension:
Indexing with a : slice keeps the dimension:
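Both behaviors on an illustrative rank-1 tensor:

```python
import tensorflow as tf

rank_1 = tf.constant([0, 1, 1, 2, 3, 5, 8, 13, 21, 34])

# Indexing with a scalar removes the axis:
print(rank_1[0].numpy())      # 0 -- a scalar, shape ()

# Indexing with a `:` slice keeps the axis:
print(rank_1[2:7].numpy())    # [1 2 3 5 8] -- still a vector
print(rank_1[::2].numpy())    # every other item
print(rank_1[::-1].numpy())   # reversed
```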
Multi-axis indexing

Higher-rank tensors are indexed by passing multiple indices. The exact same rules as in the single-axis case apply to each axis independently.
Passing an integer for each index, the result is a scalar.
You can index using any combination of integers and slices:
Here is an example with a 3-axis tensor:
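A sketch of multi-axis indexing on rank-2 and rank-3 tensors (values are illustrative):

```python
import tensorflow as tf

rank_2 = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# An integer per axis yields a scalar:
print(rank_2[1, 1].numpy())     # 4.0

# Mixing integers and slices:
print(rank_2[1, :].numpy())     # second row
print(rank_2[:, 1].numpy())     # second column
print(rank_2[-1, :].numpy())    # last row

# A 3-axis example: all batches and rows, but only the last feature.
rank_3 = tf.reshape(tf.range(30), [3, 2, 5])
print(rank_3[:, :, 4])          # shape (3, 2)
```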
Manipulating Shapes

Reshaping a tensor is of great utility. The shape property returns a TensorShape object that shows the size along each axis; you can convert this object into a Python list, too.
You can reshape a tensor into a new shape. Reshaping is fast and cheap as the underlying data does not need to be duplicated.
The data maintains its layout in memory, and a new tensor is created with the requested shape, pointing to the same data. TensorFlow uses C-style "row-major" memory ordering, where incrementing the right-most index corresponds to a single step in memory.
If you flatten a tensor you can see what order it is laid out in memory.
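A minimal sketch of reshaping and flattening (passing -1 in the new shape means "whatever fits"):

```python
import tensorflow as tf

x = tf.constant([[1], [2], [3]])
print(x.shape)                  # (3, 1)

# Reshape to any compatible shape; the data is not copied.
reshaped = tf.reshape(x, [1, 3])
print(reshaped.shape)           # (1, 3)

# Flattening reveals the row-major order the data is laid out in.
rank_3 = tf.reshape(tf.range(30), [3, 2, 5])
flat = tf.reshape(rank_3, [-1])
print(flat.numpy())             # [ 0  1  2 ... 29]
```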
Typically the only reasonable use of tf.reshape is to combine or split adjacent axes (or add/remove 1s). For this 3x2x5 tensor, reshaping to (3x2)x5 or 3x(2x5) are both reasonable things to do, as the slices do not mix:
Reshaping will "work" for any new shape with the same total number of elements, but it will not do anything useful if you do not respect the order of the axes. Swapping axes in tf.reshape does not work; you need tf.transpose for that.
You may run across not-fully-specified shapes. Either the shape contains a None (an axis length is unknown) or the whole shape is None (the rank of the tensor is unknown). Except for tf.RaggedTensor, such shapes will only occur in the context of TensorFlow's symbolic, graph-building APIs:
More on DTypes

To inspect a tf.Tensor's data type, use the Tensor.dtype property. When creating a tf.Tensor from a Python object, you may optionally specify the datatype. If you don't, TensorFlow chooses a datatype that can represent your data: Python integers become tf.int32 and Python floating point numbers become tf.float32. Otherwise, TensorFlow uses the same rules NumPy uses when converting to arrays. You can cast from type to type.
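A short sketch of the default dtypes and of casting with tf.cast:

```python
import tensorflow as tf

# TensorFlow picks tf.int32 / tf.float32 for Python ints / floats by default.
i = tf.constant(1)
f = tf.constant(2.0)
print(i.dtype, f.dtype)

# Or specify the dtype explicitly, then cast between types with tf.cast.
f64 = tf.constant([2.2, 3.3, 4.4], dtype=tf.float64)
f16 = tf.cast(f64, dtype=tf.float16)
u8 = tf.cast(f16, dtype=tf.uint8)   # casting to an int type drops the decimals
print(u8.numpy())                   # [2 3 4]
```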
Broadcasting

Broadcasting is a concept borrowed from the equivalent feature in NumPy. In short, under certain conditions, smaller tensors are "stretched" automatically to fit larger tensors when running combined operations on them. The simplest and most common case is when you attempt to multiply or add a tensor to a scalar. In that case, the scalar is broadcast to be the same shape as the other argument.
Likewise, axes with length 1 can be stretched out to match the other arguments, and both arguments can be stretched in the same computation. In this case a 3x1 matrix is element-wise multiplied by a 1x4 matrix to produce a 3x4 matrix. Note how the leading 1 is optional: the shape of y is [4].
Here is the same operation without broadcasting:
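Both forms side by side, as a sketch (the values are illustrative):

```python
import tensorflow as tf

x = tf.constant([1, 2, 3])

# A scalar is broadcast to the shape of the other argument:
print((x * 2).numpy())            # [2 4 6]

# A (3, 1) column times a (4,) row broadcasts to a (3, 4) matrix:
col = tf.reshape(tf.constant([1, 2, 3]), [3, 1])
row = tf.constant([1, 2, 3, 4])   # the leading 1 in the shape is optional
product = col * row
print(product.numpy())

# The same result without broadcasting, using fully materialized operands:
full_col = tf.broadcast_to(col, [3, 4])
full_row = tf.broadcast_to(row, [3, 4])
print((full_col * full_row).numpy())
```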
Most of the time, broadcasting is both time and space efficient, as the broadcast operation never materializes the expanded tensors in memory. You can see what broadcasting looks like using tf.broadcast_to.
Unlike a mathematical op, for example, broadcast_to does nothing special to save memory: here, you are materializing the expanded tensor.
It can get even more complicated. This section of Jake VanderPlas's book Python Data Science Handbook shows more broadcasting tricks (again in NumPy).

tf.convert_to_tensor

Most ops, like tf.matmul and tf.reshape, take arguments of class tf.Tensor. However, Python objects shaped like tensors are also accepted. Most, but not all, ops call convert_to_tensor on non-tensor arguments. There is a registry of conversions, and most object classes, like NumPy's ndarray, TensorShape, Python lists, and tf.Variable, will all convert automatically. See tf.register_tensor_conversion_function for more details.

Ragged Tensors

A tensor with variable numbers of elements along some axis is called "ragged". Use tf.ragged.RaggedTensor for ragged data. For example, a list of rows with different lengths cannot be represented as a regular tensor:
Instead create a tf.RaggedTensor using tf.ragged.constant:
The shape of a tf.RaggedTensor will contain some axes with unknown lengths:
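A minimal ragged-tensor sketch (row values are illustrative):

```python
import tensorflow as tf

# Rows of different lengths cannot form a regular, rectangular tensor...
ragged_list = [[0, 1, 2, 3], [4, 5], [6, 7, 8], [9]]

# ...but they can form a tf.RaggedTensor:
ragged = tf.ragged.constant(ragged_list)
print(ragged)
print(ragged.shape)   # (4, None) -- the second axis has unknown length
```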
String tensors
tf.string is a dtype, which is to say you can represent data as strings (variable-length byte arrays) in tensors. The strings are atomic and cannot be indexed the way Python strings are. The length of the string is not one of the dimensions of the tensor. See tf.strings for functions to manipulate them. Here is a scalar string tensor:
In the above printout the b prefix indicates that the tf.string dtype is not a unicode string, but a byte-string. If you pass unicode characters, they are utf-8 encoded.
Some basic functions with strings can be found in tf.strings, including tf.strings.split and tf.strings.to_number.
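A short sketch of those two functions (the string values are illustrative):

```python
import tensorflow as tf

s = tf.constant(["Gray wolf", "Quick brown fox", "Lazy dog"])

# Splitting a vector of strings returns a RaggedTensor, since each
# string may split into a different number of parts:
splits = tf.strings.split(s, sep=" ")
print(splits)

# tf.strings.to_number parses numbers out of strings (float32 by default):
n = tf.strings.to_number(tf.strings.split(tf.constant("1 10 100"), " "))
print(n.numpy())   # [  1.  10. 100.]
```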
Although you can't use tf.cast to turn a string tensor into numbers, you can convert it into bytes, and then into numbers.
The tf.string dtype is used for all raw bytes data in TensorFlow. The tf.io module contains functions for converting data to and from bytes, including decoding images and parsing csv.

Sparse tensors

Sometimes, your data is sparse, like a very wide embedding space. TensorFlow supports tf.sparse.SparseTensor and related operations to store sparse data efficiently.
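A minimal sparse-tensor sketch: only the nonzero values and their indices are stored.

```python
import tensorflow as tf

sparse = tf.sparse.SparseTensor(indices=[[0, 0], [1, 2]],
                                values=[1, 2],
                                dense_shape=[3, 4])
print(sparse)

# Convert back to a dense tensor to see the zeros filled in:
dense = tf.sparse.to_dense(sparse)
print(dense.numpy())
```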
Lab Task 3: Introduction to Variables

A TensorFlow variable is the recommended way to represent shared, persistent state your program manipulates. This guide covers how to create, update, and manage instances of tf.Variable. Variables are created and tracked via the tf.Variable class. A tf.Variable represents a tensor whose value can be changed by running ops on it; specific ops allow you to read and modify the values of this tensor.

Setup

This notebook discusses variable placement. If you want to see on what device your variables are placed, uncomment this line.
Create a variable

To create a variable, provide an initial value. The tf.Variable will have the same dtype as the initialization value.
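For example (the initial values here are illustrative):

```python
import tensorflow as tf

# A variable takes its dtype and shape from its initial value:
my_variable = tf.Variable([[1.0, 2.0], [3.0, 4.0]])

print("Shape:", my_variable.shape)
print("DType:", my_variable.dtype)
print("As NumPy:", my_variable.numpy())

# Variables can hold any tensor dtype, including bools:
bool_variable = tf.Variable([False, False, False, True])
```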
A variable looks and acts like a tensor, and, in fact, is a data structure backed by a tf.Tensor. Like tensors, variables have a dtype and a shape, and can be exported to NumPy.
Most tensor operations work on variables as expected, although variables cannot be reshaped.
As noted above, variables are backed by tensors. You can reassign the tensor using tf.Variable.assign. Calling assign does not (usually) allocate a new tensor; instead, the existing tensor's memory is reused.
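A sketch of in-place reassignment; note the new value must match the existing shape and dtype:

```python
import tensorflow as tf

a = tf.Variable([2.0, 3.0])

# assign reuses the variable's existing memory.
a.assign([1.0, 2.0])

# A value with the wrong shape is rejected:
try:
    a.assign([1.0, 2.0, 3.0])
except Exception as e:
    print("Rejected:", e)

# assign_add and assign_sub also update in place:
a.assign_add([1.0, 1.0])
print(a.numpy())   # [2. 3.]
```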
If you use a variable like a tensor in operations, you will usually operate on the backing tensor. Creating new variables from existing variables duplicates the backing tensors. Two variables will not share the same memory.
Lifecycles, naming, and watching

In Python-based TensorFlow, tf.Variable instances have the same lifecycle as other Python objects: when there are no references to a variable, it is automatically deallocated. Variables can also be named, which can help you track and debug them. You can give two variables the same name.
Variable names are preserved when saving and loading models. By default, variables in models acquire unique variable names automatically, so you don't need to assign them yourself unless you want to. Although variables are important for differentiation, some variables will not need to be differentiated. You can turn off gradients for a variable by setting trainable to false at creation. An example of a variable that would not need gradients is a training step counter.
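A sketch of a non-trainable step counter; a GradientTape only watches trainable variables, so it receives no gradient (the variable names here are illustrative):

```python
import tensorflow as tf

step_counter = tf.Variable(1.0, trainable=False, name="step_counter")
x = tf.Variable(3.0)  # trainable by default

with tf.GradientTape() as tape:
    y = x * x + step_counter

grads = tape.gradient(y, [x, step_counter])
print(grads[0].numpy())   # 6.0 -- dy/dx
print(grads[1])           # None -- non-trainable variables are not watched
```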
Placing variables and tensors

For better performance, TensorFlow will attempt to place tensors and variables on the fastest device compatible with their dtype. This means most variables are placed on a GPU if one is available. However, we can override this. In this snippet, we can place a float tensor and a variable on the CPU, even if a GPU is available. By turning on device placement logging (see Setup), we can see where the variable is placed.

Note: Although manual placement works, using distribution strategies can be a more convenient and scalable way to optimize your computation. If you run this notebook on different backends with and without a GPU you will see different logging. Note that logging device placement must be turned on at the start of the session.
It's possible to set the location of a variable or tensor on one device and do the computation on another device. This will introduce delay, as data needs to be copied between the devices. You might do this, however, if you had multiple GPU workers but only want one copy of the variables.
Note: Because soft device placement is turned on by default, even if you run this code on a device without a GPU, it will still run. For more on distributed training, see our guide.

Copyright 2020 Google Inc. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.