[[Jetson TX1]]
|SoC|Tegra X1|
|CPU|Cortex A57 4 core|
|GPU|Maxwell 128 Core|
|Memory|4GB LPDDR4|
|Storage|Micro SD|
[[Arm]][[:Arm Cortex A57]]
*Specs [#m4f9b40c]
-CPU:Cortex-A57 (1.4GHz)
-cpuinfo
$ cat /proc/cpuinfo
processor : 0
model name : ARMv8 Processor rev 1 (v8l)
BogoMIPS : 38.40
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd07
CPU revision : 1
processor : 1
model name : ARMv8 Processor rev 1 (v8l)
BogoMIPS : 38.40
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd07
CPU revision : 1
processor : 2
model name : ARMv8 Processor rev 1 (v8l)
BogoMIPS : 38.40
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd07
CPU revision : 1
processor : 3
model name : ARMv8 Processor rev 1 (v8l)
BogoMIPS : 38.40
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd07
CPU revision : 1
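The `CPU implementer` and `CPU part` fields above are enough to identify the core: implementer 0x41 is ARM and part 0xd07 is the Cortex-A57. The following is a minimal sketch of that lookup. The table contents, function names, and the shortened `SAMPLE` text are illustrative assumptions, and the tables cover only a few well-known IDs.

```python
# Minimal sketch: identify the core from /proc/cpuinfo-style fields.
# The lookup tables are deliberately tiny (assumption: only a few known IDs).
IMPLEMENTERS = {0x41: "ARM"}
ARM_PARTS = {0xD03: "Cortex-A53", 0xD07: "Cortex-A57", 0xD08: "Cortex-A72"}


def parse_cpuinfo(text):
    """Return one dict per 'processor' entry found in the text."""
    cpus = []
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, value = (field.strip() for field in line.split(":", 1))
        if key == "processor":
            cpus.append({})  # a new processor entry starts here
        if cpus:
            cpus[-1][key] = value
    return cpus


def core_name(cpu):
    """Map the implementer/part pair to a readable name, or hex if unknown."""
    implementer = int(cpu["CPU implementer"], 16)
    part = int(cpu["CPU part"], 16)
    vendor = IMPLEMENTERS.get(implementer, hex(implementer))
    model = ARM_PARTS.get(part, hex(part)) if implementer == 0x41 else hex(part)
    return f"{vendor} {model}"


# Shortened copy of the output above (two of the four cores).
SAMPLE = """\
processor : 0
CPU implementer : 0x41
CPU part : 0xd07
processor : 1
CPU implementer : 0x41
CPU part : 0xd07
"""

if __name__ == "__main__":
    for cpu in parse_cpuinfo(SAMPLE):
        print(cpu["processor"], core_name(cpu))
```

On the board itself, the text of `/proc/cpuinfo` could be passed to `parse_cpuinfo` instead of `SAMPLE`.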
-lscpu
$ lscpu
Architecture: aarch64
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
Vendor ID: ARM
Model: 1
Model name: Cortex-A57
Stepping: r1p1
CPU max MHz: 1428.0000
CPU min MHz: 102.0000
BogoMIPS: 38.40
L1d cache: 32K
L1i cache: 48K
L2 cache: 2048K
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32
-cpufreq
$ cat /sys/bus/cpu/devices/cpu?/cpufreq/cpuinfo_max_freq
1428000
1428000
1428000
1428000
-kernel
$ uname -a
Linux nvidia-nano 4.9.140-tegra #1 SMP PREEMPT Tue Jul 16 17:04:49 PDT 2019 aarch64 aarch64 aarch64 GNU/Linux
-OS
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.2 LTS
Release: 18.04
Codename: bionic
-auxv
$ xxd -e -g8 /proc/self/auxv
00000000: 0000000000000021 0000007f92ef4000 !........@......
00000010: 0000000000000010 00000000000000ff ................
00000020: 0000000000000006 0000000000001000 ................
00000030: 0000000000000011 0000000000000064 ........d.......
00000040: 0000000000000003 00000055859d9040 ........@...U...
00000050: 0000000000000004 0000000000000038 ........8.......
00000060: 0000000000000005 0000000000000008 ................
00000070: 0000000000000007 0000007f92ec9000 ................
00000080: 0000000000000008 0000000000000000 ................
00000090: 0000000000000009 00000055859db1a4 ............U...
000000a0: 000000000000000b 00000000000003e8 ................
000000b0: 000000000000000c 00000000000003e8 ................
000000c0: 000000000000000d 00000000000003e8 ................
000000d0: 000000000000000e 00000000000003e8 ................
000000e0: 0000000000000017 0000000000000000 ................
000000f0: 0000000000000019 0000007fe8d6d208 ................
00000100: 000000000000001f 0000007fe8d6dfeb ................
00000110: 000000000000000f 0000007fe8d6d218 ................
00000120: 0000000000000000 0000000000000000 ................
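The dump above is a sequence of (type, value) pairs of 64-bit little-endian words, ending with an `AT_NULL` entry. For example, type 0x10 is `AT_HWCAP` with value 0xff, and type 0x06 is `AT_PAGESZ` with value 0x1000. The following is a minimal sketch of a decoder, assuming a 64-bit little-endian process. The function names and the `SAMPLE` bytes are illustrative, and the bit names follow the arm64 HWCAP order that matches the Features line.

```python
import struct

# Auxiliary vector entry types that appear in the dump above.
AT_NAMES = {
    0: "AT_NULL", 3: "AT_PHDR", 4: "AT_PHENT", 5: "AT_PHNUM", 6: "AT_PAGESZ",
    7: "AT_BASE", 8: "AT_FLAGS", 9: "AT_ENTRY", 11: "AT_UID", 12: "AT_EUID",
    13: "AT_GID", 14: "AT_EGID", 15: "AT_PLATFORM", 16: "AT_HWCAP",
    17: "AT_CLKTCK", 23: "AT_SECURE", 25: "AT_RANDOM", 31: "AT_EXECFN",
    33: "AT_SYSINFO_EHDR",
}

# arm64 HWCAP bits 0..7, lowest bit first.
HWCAP_BITS = ["fp", "asimd", "evtstrm", "aes", "pmull", "sha1", "sha2", "crc32"]


def parse_auxv(data):
    """Decode 16-byte (type, value) pairs until the AT_NULL terminator."""
    entries = []
    for a_type, a_val in struct.iter_unpack("<QQ", data):
        if a_type == 0:
            break
        entries.append((AT_NAMES.get(a_type, f"AT_{a_type}"), a_val))
    return entries


def hwcap_features(hwcap):
    """Return the feature names whose bit is set in an AT_HWCAP value."""
    return [name for bit, name in enumerate(HWCAP_BITS) if hwcap & (1 << bit)]


# Three entries copied from the dump: AT_HWCAP, AT_PAGESZ, AT_NULL.
SAMPLE = struct.pack("<6Q", 16, 0xFF, 6, 0x1000, 0, 0)

if __name__ == "__main__":
    for name, value in parse_auxv(SAMPLE):
        print(f"{name:16s} {value:#x}")
    print(" ".join(hwcap_features(0xFF)))
```

On the board, the bytes could instead be read with `open("/proc/self/auxv", "rb").read()`.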
-deviceQuery
/usr/local/cuda-10.0/samples/1_Utilities/deviceQuery/deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "NVIDIA Tegra X1"
CUDA Driver Version / Runtime Version 10.0 / 10.0
CUDA Capability Major/Minor version number: 5.3
Total amount of global memory: 3956 MBytes (4148543488 bytes)
( 1) Multiprocessors, (128) CUDA Cores/MP: 128 CUDA Cores
GPU Max Clock rate: 922 MHz (0.92 GHz)
Memory Clock rate: 13 Mhz
Memory Bus Width: 64-bit
L2 Cache Size: 262144 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 32768
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: Yes
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: No
Supports Cooperative Kernel Launch: No
Supports MultiDevice Co-op Kernel Launch: No
Device PCI Domain ID / Bus ID / location ID: 0 / 0 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.0, CUDA Runtime Version = 10.0, NumDevs = 1
Result = PASS
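As a rough reading of the numbers above, a theoretical FP32 peak can be estimated from the SM count, the cores per SM, and the maximum clock. This is only a sketch under two assumptions that deviceQuery does not report: each CUDA core retires one fused multiply-add per cycle (counted as 2 FLOPs), and the GPU holds its 922 MHz maximum clock. The function name is illustrative.

```python
def peak_fp32_gflops(sm_count, cores_per_sm, clock_mhz, flops_per_core_per_cycle=2):
    """Theoretical peak in GFLOPS; assumes one FMA (2 FLOPs) per core per cycle."""
    return sm_count * cores_per_sm * flops_per_core_per_cycle * clock_mhz / 1000.0


if __name__ == "__main__":
    # Values taken from the deviceQuery output: 1 SM, 128 cores/SM, 922 MHz.
    print(f"{peak_fp32_gflops(1, 128, 922):.1f} GFLOPS")
```

Real kernels will land well below this figure. It is useful mainly as an upper bound when comparing boards.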
-gcc
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/aarch64-linux-gnu/7/lto-wrapper
Target: aarch64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 7.4.0-1ubuntu1~18.04.1'
--with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++ --prefix=/usr
--with-gcc-major-version-only --program-suffix=-7 --program-prefix=aarch64-linux-gnu-
--enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext
--enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-libquadmath --disable-libquadmath-support --enable-plugin
--enable-default-pie --with-system-zlib --enable-multiarch --enable-fix-cortex-a53-843419
--disable-werror --enable-checking=release --build=aarch64-linux-gnu --host=aarch64-linux-gnu
--target=aarch64-linux-gnu
Thread model: posix
gcc version 7.4.0 (Ubuntu/Linaro 7.4.0-1ubuntu1~18.04.1)
-nvcc
$ /usr/local/cuda-10.0/bin/nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Mon_Mar_11_22:13:24_CDT_2019
Cuda compilation tools, release 10.0, V10.0.326
-ccache
--Install version 3.6, which supports nvcc, from a package &note{ccache-3-6-deb-package-page:[[3.6-1 : ccache : arm64 : Disco (19.04) : Ubuntu>https://launchpad.net/ubuntu/disco/arm64/ccache/3.6-1]], Version 3.6, accessed 2019-07-28};
$ sudo dpkg -i ccache_3.6-1_arm64.deb
Selecting previously unselected package ccache.
(Reading database ... 138239 files and directories currently installed.)
Preparing to unpack ccache_3.6-1_arm64.deb ...
Unpacking ccache (3.6-1) ...
Setting up ccache (3.6-1) ...
Updating symlinks in /usr/lib/ccache ...
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
$ ccache --version
ccache version 3.6
Copyright (C) 2002-2007 Andrew Tridgell
Copyright (C) 2009-2019 Joel Rosdahl
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; either version 3 of the License, or (at your option) any later
version.
-cmake
--The version provided as a package is 3.10.2. Happily, it supports nvcc + ccache.
$ sudo apt -y install cmake
$ cmake --version
cmake version 3.10.2
CMake suite maintained and supported by Kitware (kitware.com/cmake).
-UART Debug console
--[[115200 8N1>https://www.jetsonhacks.com/2019/04/19/jetson-nano-serial-console/]] &note{jetson-tx1-serial-console:[[Jetson Nano - Serial Console - JetsonHacks>https://www.jetsonhacks.com/2019/04/19/jetson-nano-serial-console/]], accessed 2019-07-30};
*Other setup [#u54ffa2e]
$ ssh-keygen -t ecdsa
$ ssh-keygen -f hoge
$ cat hoge.pub >> .ssh/authorized_keys
$ sudo apt update && sudo apt upgrade
$ git config --global user.name "Tomoaki Teshima"
$ git config --global user.email "tomoaki.teshima@gmail.com"
$ time sudo apt-get -y install libgtk-3-dev openjdk-8-jre-headless
*Comparison with [[Jetson TX1]] [#r3286f5e]
-Software-side differences such as the OS and CUDA versions are omitted
-cpu (/proc/cpuinfo)
--evtstrm now appears in the Features line reported for the Cortex-A57
+ Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
- Features : fp asimd aes pmull sha1 sha2 crc32
-gpu (deviceQuery)
--The number of CUDA cores is exactly halved (256 to 128)
CUDA Capability Major/Minor version number: 5.3
+ Total amount of global memory: 3964 MBytes (4156932096 bytes)
+ ( 1) Multiprocessors, (128) CUDA Cores/MP: 128 CUDA Cores
- Total amount of global memory: 3994 MBytes (4188004352 bytes)
- ( 2) Multiprocessors, (128) CUDA Cores/MP: 256 CUDA Cores
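The two differences listed above can also be checked mechanically. The following is a minimal sketch that diffs the two Features lines as sets and computes the CUDA core ratio. The `+` lines are taken to be this board and the `-` lines the Jetson TX1, and the constant and function names are illustrative.

```python
# Features lines copied from the cpu comparison above.
THIS_BOARD_FEATURES = "fp asimd evtstrm aes pmull sha1 sha2 crc32"
TX1_FEATURES = "fp asimd aes pmull sha1 sha2 crc32"

# CUDA core counts copied from the gpu comparison above.
THIS_BOARD_CUDA_CORES = 1 * 128
TX1_CUDA_CORES = 2 * 128


def feature_diff(new, old):
    """Return (added, removed) feature names between two Features lines."""
    new_set, old_set = set(new.split()), set(old.split())
    return sorted(new_set - old_set), sorted(old_set - new_set)


if __name__ == "__main__":
    added, removed = feature_diff(THIS_BOARD_FEATURES, TX1_FEATURES)
    print("added:", added, "removed:", removed)
    print("core ratio:", THIS_BOARD_CUDA_CORES / TX1_CUDA_CORES)
```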