Upgrading an Amazon AWS EC2 T2 Instance

This post covers Amazon AWS EC2 T2 instances, some Amazon CloudWatch metrics, and how to upgrade from t2.micro to t2.small.

1. Preface

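I used the AWS Free Tier (t2.micro) for a year, renewed for a second year, and the next renewal is coming up. As traffic to the blog has grown, the t2.micro has clearly become too small (both memory and CPU): the site goes down during traffic peaks. The figure below shows the Google Analytics statistics for the past 30 days (2017/03/29 to 2017/04/27):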

Fig. 1 Google Analytics statistics for Spark & Shine from March 29 to April 27, 2017

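I expect to stay on T2 for a long time, so I decided to spend some time understanding it, mainly to figure out which metrics indicate whether a T2 instance needs an upgrade.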

2. T2 Instances

T2 instances are described on the Amazon EC2 Instance Types page. An excerpt:

T2 instances are Burstable Performance Instances that provide a baseline level of CPU performance with the ability to burst above the baseline. The baseline performance and ability to burst are governed by CPU Credits. Each T2 instance receives CPU Credits continuously at a set rate depending on the instance size.  T2 instances accrue CPU Credits when they are idle, and use CPU credits when they are active.  T2 instances are a good choice for workloads that don’t use the full CPU often or consistently, but occasionally need to burst (e.g. web servers, developer environments and databases). For more information see Burstable Performance Instances.

T2 instances come in seven sizes, from the smallest, nano, to the largest, 2xlarge, as shown in the figure below; see here for details.

Fig. 2 Amazon EC2 T2 Instance Types.

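The basic idea of T2 is that CPU capacity left unused during quiet periods accumulates as CPU Credits (the system grants CPU Credits to a T2 instance at a fixed hourly rate), which can later be spent to run the CPU above its baseline.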

Burstable Performance Instances provide a baseline level of CPU performance with the ability to burst above the baseline.

One CPU credit is equal to one vCPU running at 100% utilization for one minute. Other combinations of vCPUs, utilization, and time are also equal to one CPU credit; for example, one vCPU running at 50% utilization for two minutes or two vCPUs running at 25% utilization for two minutes.

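One CPU Credit lets one vCPU run at full speed for one minute, so running a vCPU at full speed around the clock would take 60 CPU Credits per hour. Each T2 size has its own baseline CPU performance. For t2.small it is 20%, meaning the sustained CPU utilization is 20% of a core, so the instance receives 12 credits per hour (60 * 20%). Unspent CPU Credits (for example while the system is idle) accumulate and expire after 24 hours. Under heavy load the instance spends the accumulated credits to raise CPU performance above the baseline (a bit like overclocking).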
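The arithmetic above is easy to check with a few lines of Python. This is only a sketch of the rule as I understand it (credits per hour = 60 minutes times the baseline), using the 20% and 10% baselines quoted in this post; the function names are my own.

```python
MINUTES_PER_HOUR = 60
RETENTION_HOURS = 24  # unspent credits are kept for up to 24 hours


def credits_per_hour(baseline_percent):
    """Credits earned per hour for a given baseline (percent of one core)."""
    return MINUTES_PER_HOUR * baseline_percent / 100


def max_credit_balance(baseline_percent):
    """Largest balance that can build up if the instance stays idle."""
    return credits_per_hour(baseline_percent) * RETENTION_HOURS


for name, baseline in [("t2.micro", 10), ("t2.small", 20)]:
    print(name, credits_per_hour(baseline), max_credit_balance(baseline))
```

For t2.small this gives the 12 credits per hour mentioned in the AWS excerpt.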

CPU Credits are described under T2 Instances CPU Credits. An excerpt:

For example, a t2.small instance receives credits continuously at a rate of 12 CPU Credits per hour. This capability provides baseline performance equivalent to 20% of a CPU core. If at any moment the instance does not need the credits it receives, it stores them in its CPU Credit balance for up to 24 hours. If and when your t2.small needs to burst to more than 20% of a core, it draws from its CPU Credit balance to handle this surge seamlessly. Over time, if you find your workload needs more CPU Credits than you have, or your instance does not maintain a positive CPU Credit balance, we recommend either a larger T2 size, such as the t2.medium, or a Fixed Performance Instance type.

What happens once all accumulated CPU Credits are used up? From the documentation:

If your instance uses all of its CPU credit balance, performance remains at the baseline performance level. If your instance is running low on credits, your instance’s CPU credit consumption (and therefore CPU performance) is gradually lowered to the base performance level over a 15-minute interval, so you will not experience a sharp performance drop-off when your CPU credits are depleted. If your instance consistently uses all of its CPU credit balance, we recommend a larger T2 size or a fixed performance instance type such as M3 or C3.

3. Amazon CloudWatch

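With that background, let's look at how many resources the running instance actually uses. Amazon CloudWatch offers a rich set of metrics under EC2 > Per-Instance Metrics (23 in my case). For detailed explanations see List the Available CloudWatch Metrics for Your Instances and Amazon EBS Metrics and Dimensions.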

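CPU Credits are probably the most important concept for T2 instances. Below are excerpts for four credit-related metrics (three for CPU, plus the EBS Burst Balance), each with data from my t2.micro instance (4 weeks, 1-hour period, Statistic set to Average).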
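I read the charts in the console, but the same data can also be pulled with boto3. The sketch below assumes boto3 is installed and AWS credentials are configured; the helper names and the instance ID in the usage note are placeholders of mine. It covers the EC2 metrics only (CPUCreditUsage, CPUCreditBalance, CPUUtilization); BurstBalance lives in the AWS/EBS namespace with a VolumeId dimension instead.

```python
from datetime import datetime, timedelta, timezone


def build_metric_query(instance_id, metric_name, end_time, statistic="Average"):
    """Build get_metric_statistics arguments: 4 weeks of data, 1-hour period."""
    return {
        "Namespace": "AWS/EC2",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end_time - timedelta(weeks=4),
        "EndTime": end_time,
        "Period": 3600,  # seconds
        "Statistics": [statistic],
    }


def fetch_datapoints(instance_id, metric_name, statistic="Average"):
    """Query CloudWatch and return the datapoints sorted by time."""
    import boto3  # third-party; imported here so the helper above works without it

    cloudwatch = boto3.client("cloudwatch")
    query = build_metric_query(
        instance_id, metric_name, datetime.now(timezone.utc), statistic
    )
    response = cloudwatch.get_metric_statistics(**query)
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])
```

A call would look like fetch_datapoints("i-0123456789abcdef0", "CPUCreditBalance"). For the credit metrics, passing statistic="Sum" may be the better choice at a 1-hour period, since they are published every five minutes.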

3.1 CPU Credit Usage

The number of CPU credits consumed during the specified period (Units: Count). This metric identifies the amount of time during which physical CPUs were used for processing instructions by virtual CPUs allocated to the instance.

Fig. 3: CPU Credit Usage

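A t2.micro earns 6 CPU Credits per hour, yet the hourly average consumption on my instance peaks at only 2, and even the hourly maximum peaks at only 4.7. Does that mean the CPU performance of t2.micro is enough for me? If so, are the occasional outages mostly caused by insufficient memory, with too many simultaneous visitors crashing the database? One caveat: CloudWatch publishes the CPU credit metrics at five-minute granularity, so with a 1-hour period the Average is a mean of five-minute samples rather than an hourly total. The Sum statistic is the one to compare against the 6 credits earned per hour, so these figures may understate the actual consumption.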

3.2 CPU Credit Balance

The number of CPU credits that an instance has accumulated (Units: Count). This metric is used to determine how long an instance can burst beyond its baseline performance level at a given rate.

Fig. 4: CPU Credit Balance

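The chart shows the CPU Credit balance sitting at 0 during quite a few periods. Usage is higher during the day and lower at night, when some CPU Credits can accumulate.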
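To get a feel for what a given balance buys, here is a rough estimate of how long an instance can burst. It is my own simplification, not an AWS formula: it assumes a single vCPU, a constant load, and that credits keep being earned at the baseline rate while bursting, and it ignores the gradual 15-minute slowdown described earlier.

```python
def burst_minutes(balance, utilization_percent, baseline_percent):
    """Minutes a credit balance lasts at a constant utilization (one vCPU).

    Per minute the instance spends utilization/100 credits and earns
    baseline/100 credits, so the balance drains at the difference.
    """
    net_percent = utilization_percent - baseline_percent
    if net_percent <= 0:
        return float("inf")  # at or below baseline the balance never drains
    return balance * 100 / net_percent


# A t2.micro (10% baseline) with a full balance of 144 credits, running flat out:
print(burst_minutes(144, 100, 10))
# The same instance at 50% utilization:
print(burst_minutes(144, 50, 10))
```

With a balance of 0, as in many of the periods on my chart, the result is 0 minutes: the instance is pinned to its baseline.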

3.3 CPU Utilization

The percentage of allocated EC2 compute units that are currently in use on the instance (Units: Percent). This metric identifies the processing power required to run an application upon a selected instance.

Note: Depending on the instance type, tools in your operating system may show a lower percentage than CloudWatch when the instance is not allocated a full processor core.

Fig. 5: CPU Utilization

The base performance (CPU Utilization) of t2.micro is only 10%. The chart shows that the instance needs to burst fairly often.

3.4 Burst Balance

Provides information about the percentage of I/O credits (for gp2) or throughput credits (for st1 and sc1) remaining in the burst bucket (Units: Percent). Data is reported to CloudWatch only when the volume is active. If the volume is not attached, no data is reported.

Note: Used with General Purpose SSD (gp2), Throughput Optimized HDD (st1), and Cold HDD (sc1) volumes only.

Fig. 6: Burst Balance

I use a General Purpose (SSD) volume on Elastic Block Store (EBS); see Amazon EBS Volume Types for details. Judging by the chart, my instance does very little I/O?

4. Upgrading the T2 Instance

4.1 Upgrading from t2.micro to t2.small

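I decided to upgrade from t2.micro to t2.small, mainly for the memory (from 1 GB to 2 GB). Because it is a reserved instance, the instance family stays T2, and the region does not change, the upgrade is simple. The steps are: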

  • Purchase the new t2.small reserved instance
  • Stop the instance: Instances --> Actions --> Instance State --> Stop
  • Change the instance type: Instances --> Instance Settings --> Change Instance Type --> t2.small
  • Start the instance: Actions --> Instance State --> Start
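I did all of this in the console, but the stop, change type, and start steps could also be scripted. Below is a minimal sketch with boto3, assuming credentials are configured; the function name is mine, and purchasing the reserved instance is not covered. The EC2 client is passed in as a parameter so the function can be tried against a stand-in object first.

```python
def resize_instance(ec2, instance_id, new_type):
    """Stop an instance, change its type, and start it again.

    `ec2` is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    # The instance type can only be changed while the instance is stopped.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )

    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
```

Usage would be something like resize_instance(boto3.client("ec2"), "i-0123456789abcdef0", "t2.small"), where the instance ID is a placeholder. Keep in mind that without an Elastic IP the public IP address usually changes after a stop and start.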

The memory is now 2 GB, as shown below:

$ free -m
             total       used       free     shared    buffers     cached
Mem:          2000        485       1514         64         21        227
-/+ buffers/cache:        236       1763
Swap:         1023          0       1023

To compare EC2 instance types you can use the site EC2Instances.info; here is a price comparison example.

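A few notes on renewal: paying upfront is considerably cheaper, and All Upfront and Partial Upfront cost almost the same. A 3-year term is much cheaper than a 1-year term, but I don't think it is worth buying three years at once, because prices keep dropping (for example, the t2.micro I bought last year cost 85 USD and now costs only 53 USD). Overall, my personal recommendation is one year with partial upfront payment (Standard 1-year Term + Partial Upfront).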

4.2 Adjusting MPM prefork

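With the server's CPU baseline and memory doubled, the MPM prefork parameters that I had lowered to save memory can be raised again. Edit /etc/apache2/mods-available/mpm_prefork.conf so that it ends up as follows: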

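<IfModule mpm_prefork_module>
    # Child processes created at startup
    StartServers            2
    # Lower and upper bounds on idle child processes
    MinSpareServers         2
    MaxSpareServers         10
    # Maximum simultaneous requests; with prefork, values above 256
    # only take effect if ServerLimit is raised as well
    MaxRequestWorkers       500
    # Recycle each child process after this many connections
    MaxConnectionsPerChild  25
</IfModule>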

Restart Apache (service apache2 restart) for the change to take effect.
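A common rule of thumb for sizing MaxRequestWorkers under prefork is to divide the memory left for Apache by the memory one child process uses. The sketch below applies that rule; the 500 MB reserved for the OS and database and the 30 MB per child are made-up illustration values, not measurements from my server.

```python
def estimate_max_request_workers(total_mb, reserved_mb, per_child_mb):
    """Rough upper bound on prefork children that fit in RAM without swapping."""
    usable_mb = total_mb - reserved_mb
    return max(1, usable_mb // per_child_mb)


# 2000 MB total (from `free -m` on the t2.small), the rest are assumptions.
print(estimate_max_request_workers(total_mb=2000, reserved_mb=500, per_child_mb=30))
```

With these example numbers the estimate is far below 500, so it is worth measuring the real per-process size (for instance with ps or top) before settling on a value.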

5. Miscellaneous

5.1 Checking disk usage

Use the df (display free disk space) command to check disk usage, for example:

ubuntu@ip-XX-XX-XX-XX:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1      9.8G  6.7G  2.6G  73% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
udev            492M   12K  492M   1% /dev
tmpfs           100M  336K   99M   1% /run
none            5.0M     0  5.0M   0% /run/lock
none            497M     0  497M   0% /run/shm
none            100M     0  100M   0% /run/user
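If a script needs the same information, for example to warn before the root volume fills up, Python's standard library offers shutil.disk_usage. A minimal sketch follows; the helper name is mine, and the percentage can differ slightly from the Use% column of df, which accounts for reserved blocks differently.

```python
import shutil


def disk_usage_percent(path="/"):
    """Return the used share of the filesystem containing `path`, in percent."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100 * usage.used / usage.total


print(f"{disk_usage_percent('/'):.0f}% of / is in use")
```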

5.2 Checking memory usage

Use the free (Display amount of free and used memory in the system) command to check memory usage, for example (-m reports in megabytes, -h stands for --human):

$ free -m 
             total       used       free     shared    buffers     cached
Mem:           992        691        301         45         74        358
-/+ buffers/cache:        258        733
Swap:         1023         99        924

$ free -h
             total       used       free     shared    buffers     cached
Mem:          992M       695M       296M        45M        74M       360M
-/+ buffers/cache:       260M       731M
Swap:         1.0G        99M       924M

Note that [2]:

Linux likes to use any extra memory to cache hard drive blocks. So you don’t want to look at just the free Mem. You want to look at the free column of the -/+ buffers/cache: row. This shows how much memory is available to applications.

So in my free -m example, 733 MB of memory is still available to applications.
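The 733 MB figure is simply free + buffers + cached from the Mem: row. The small sketch below recomputes it from the free -m output shown above. It assumes this older column layout (total, used, free, shared, buffers, cached); newer versions of free print different columns, including "available", so the parsing would need adjusting there.

```python
FREE_OUTPUT = """\
             total       used       free     shared    buffers     cached
Mem:           992        691        301         45         74        358
-/+ buffers/cache:        258        733
Swap:         1023         99        924
"""


def available_mb(free_output):
    """Memory available to applications: free + buffers + cached (in MB)."""
    for line in free_output.splitlines():
        if line.startswith("Mem:"):
            total, used, free, shared, buffers, cached = map(int, line.split()[1:7])
            return free + buffers + cached
    raise ValueError("no 'Mem:' row found in the output")


print(available_mb(FREE_OUTPUT))  # 733
```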

References:
[1] StackOverflow: How to safely upgrade an Amazon EC2 instance from t1.micro to large?
[2] AskUbuntu: How can I monitor the memory usage?
[3] StackOverflow: Upgrading ec2 from t2.micro to t2.medium or t2.large
[4] StackOverflow: How do you increase the max number of concurrent connections in Apache?
[5] serverfault: Optimal values for ServerLimit, MaxClients, MaxRequestsPerChild directives
[6] How to optimize apache web server for maximum concurrent connections or increase max clients in apache
