Front-end caching analysis #32

mominger opened this issue Apr 28, 2021 · 0 comments
The English version comes first and the Chinese version follows, with the same content. If anything is wrong or hard to understand, especially in the English version (which is based on Google Translate with modifications), please feel free to let me know. Many thanks.

Overview

An analysis of front-end caching, based on the Chrome browser.
The front end involves DNS caching, CDN caching, HTTP caching, and browser local caching. This article discusses only HTTP caching and browser local caching in detail.
DNS caching: a DNS lookup generally takes about 20 ms, and the browser caches DNS records for more than one minute by default to reduce DNS queries.
CDN caching: the user's domain name resolves through the CDN's DNS server to the IP address of the CDN's load-balancing server, and static resources are then fetched from a nearby node.

1. Why do we need caching

  • HTTP caching reduces the number of requests or the size of the response
    why_http_cache
  • Local caching also reduces the number of requests
    local_cache

2. HTTP caching flowchart

http_flow

Service Worker is a special case: it actually belongs to browser local caching, but it also plays an important role in the HTTP caching flow.
The important nodes in the flowchart are explained below.

3. Service Worker

3.1 How to set

  • The developer writes code (cache.put) to choose which files to cache and to set routing rules. A request that matches an entry in CacheStorage is answered directly from the cache; a request that does not match is fetched from the network.
  • Specific writing
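
The cache-first behaviour described above can be sketched as follows. This is only an illustration under stated assumptions: `SimpleCache`, `Fetcher`, `shouldCache`, and the `/static/` routing rule are hypothetical stand-ins so the logic can run outside a browser. In a real service worker the same flow would typically live in a `fetch` event listener and use `caches.open`, `cache.match`, `cache.put`, and `event.respondWith`.

```typescript
// Minimal stand-ins for the parts of the Cache API used here (assumption:
// bodies are plain strings; the browser API works with Request/Response).
interface SimpleCache {
  match(url: string): Promise<string | undefined>;
  put(url: string, body: string): Promise<void>;
}

type Fetcher = (url: string) => Promise<string>;

// Hypothetical routing rule: only URLs under /static/ are cached.
function shouldCache(url: string): boolean {
  return url.indexOf("/static/") === 0;
}

// Cache-first: return the cached copy on a hit, otherwise fetch and store it.
async function cacheFirst(
  url: string,
  cache: SimpleCache,
  fetcher: Fetcher
): Promise<string> {
  if (!shouldCache(url)) {
    return fetcher(url);
  }
  const cached = await cache.match(url);
  if (cached !== undefined) {
    return cached;
  }
  const fresh = await fetcher(url);
  await cache.put(url, fresh);
  return fresh;
}

// In-memory cache used only to exercise the sketch outside a browser.
function createMemoryCache(): SimpleCache {
  const store = new Map<string, string>();
  return {
    match: async (url) => store.get(url),
    put: async (url, body) => {
      store.set(url, body);
    },
  };
}
```

With this shape, a second request for the same `/static/` URL is served from the cache and never reaches the fetcher.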

3.2 Caching Strategy

  • Entries never expire unless they are deleted via cache.delete.
  • serviceworker_limit
  • The size limit is usually >= 50 MB
  • Reference size limit
  • Note: for CacheStorage, the memory cache, and the disk cache alike, once the browser's size limit is exceeded, the browser automatically evicts some entries according to its own algorithm

3.3 Examples

servicework_eg

4. Memory caching

4.1 How to set

  • Automatically set by the Browser and stored in the memory

4.2 Caching Strategy

  • It is cleared when the current Chrome tab is closed
  • GET requests with the same URL, such as images, fonts, and scripts, are cached automatically
  • Note: of the Cache-Control response directives, only no-store keeps a resource out of the memory cache. No other Cache-Control setting affects the memory cache

4.3 Examples

member_cache

5. Disk caching

5.1 How to set

  • The disk cache strictly follows the HTTP response headers; the browser populates it automatically and stores it on the hard disk

5.2 Caching Strategy

  • Entries are cleared based on the HTTP response headers
  • There are two types: mandatory caching and negotiation caching
    cache

Mandatory caching means the client caches the response and, while it is still fresh, reuses it without sending a request, which reduces the number of requests.
Negotiation caching means the client caches the response but still sends a request each time so that the server can compare versions. If the resource has been modified, the server returns 200 with the new content; if not, it returns 304 and the client reuses its cached copy.
The disk cache is controlled through HTTP request and response headers, so it is also called the HTTP cache.
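
As a rough sketch of how the two types fit together, the function below picks between reusing the cached copy, revalidating with the server, and fetching from scratch. The `CachedEntry` shape and the name `decideCacheUse` are assumptions made for illustration; a real browser weighs more inputs than this.

```typescript
// Hypothetical shape of a stored response, for illustration only.
interface CachedEntry {
  storedAtMs: number;    // when the response was stored (ms since epoch)
  maxAgeSeconds: number; // freshness lifetime taken from the response headers
  etag?: string;         // validator for If-None-Match, if the server sent one
  lastModified?: string; // validator for If-Modified-Since, if the server sent one
}

type CacheDecision = "use-cache" | "revalidate" | "fetch";

function decideCacheUse(entry: CachedEntry | undefined, nowMs: number): CacheDecision {
  if (entry === undefined) {
    return "fetch"; // nothing stored yet
  }
  const ageSeconds = (nowMs - entry.storedAtMs) / 1000;
  if (ageSeconds < entry.maxAgeSeconds) {
    return "use-cache"; // mandatory caching: still fresh, no request is sent
  }
  if (entry.etag !== undefined || entry.lastModified !== undefined) {
    return "revalidate"; // negotiation caching: conditional request, expect 304 or 200
  }
  return "fetch"; // stale and no validator available
}
```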

5.2.1 Mandatory caching: Expires

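Expires specifies the point in time at which the response expires.
It is an absolute time (an HTTP date with one-second precision) that is compared against the client's clock, so it becomes unreliable if the client's time is changed.
It is the older mechanism and has been superseded by the relative-time setting Cache-Control: max-age in HTTP/1.1.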
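
To make the relationship between the two headers concrete, here is a small sketch that derives a freshness lifetime in seconds: max-age wins when present, otherwise the lifetime is Expires minus Date. The function name and the lower-cased header map are assumptions for illustration, and the sketch ignores the other inputs real caches consider.

```typescript
// Header names are assumed to be lower-cased already.
function freshnessLifetimeSeconds(
  headers: { [name: string]: string | undefined }
): number {
  const cacheControl = headers["cache-control"];
  if (cacheControl !== undefined) {
    // max-age is relative, so it does not depend on the client's clock.
    const match = /(?:^|,)\s*max-age\s*=\s*(\d+)/i.exec(cacheControl);
    if (match !== null) {
      return parseInt(match[1], 10);
    }
  }
  const expires = headers["expires"];
  const date = headers["date"];
  if (expires !== undefined && date !== undefined) {
    // Expires is absolute: lifetime is the gap between Expires and Date.
    const deltaSeconds = (Date.parse(expires) - Date.parse(date)) / 1000;
    return isNaN(deltaSeconds) ? 0 : Math.max(0, deltaSeconds);
  }
  return 0; // no explicit freshness information
}
```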

5.2.1.1 Example

expires_cache

5.2.2 Mandatory caching: cache-control
key | description
no-cache | The response may be stored, but the client (browser, proxy server) must first check with the server, for example via ETag, whether the resource has been modified. If modified, the server responds 200 with the resource content; if unmodified, it responds 304
no-store | Caching is prohibited; the resource is requested from the server every time
public | Proxy servers such as CDNs are allowed to cache the response
private | Proxy servers such as CDNs are not allowed to cache the response
max-age | The maximum time the cached response stays valid, in seconds
must-revalidate | Once the max-age time is exceeded, the cache must send a request to the server to verify whether the resource has been modified
s-maxage | Similar to max-age, but sets the cache time for proxy servers, where it takes priority over max-age
no-transform | Proxy servers are not allowed to transform the content, e.g. change the format of images

These directives can be combined (see the priority of mixed use).
no-cache is roughly equivalent to max-age=0, must-revalidate.
In HTTP/1.0, Pragma can also be set to no-cache and used in conjunction with Expires.
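
Since the directives are usually combined in one header value, a small parser helps when inspecting them. The sketch below splits a Cache-Control value into a map. It is deliberately naive: it splits on commas, so a quoted value that itself contains a comma would be handled incorrectly. The function name is an assumption for illustration.

```typescript
// Parses e.g. "public, max-age=31536000" into { public: true, max-age: "31536000" }.
function parseCacheControl(header: string): Map<string, string | true> {
  const directives = new Map<string, string | true>();
  for (const part of header.split(",")) {
    const trimmed = part.trim();
    if (trimmed === "") {
      continue;
    }
    const eq = trimmed.indexOf("=");
    if (eq === -1) {
      // Flag-style directive such as no-store or public.
      directives.set(trimmed.toLowerCase(), true);
    } else {
      // Valued directive such as max-age=60; surrounding quotes are dropped.
      const key = trimmed.slice(0, eq).trim().toLowerCase();
      const value = trimmed.slice(eq + 1).trim().replace(/^"|"$/g, "");
      directives.set(key, value);
    }
  }
  return directives;
}
```

For example, `parseCacheControl("public, max-age=31536000").get("max-age")` yields the string `"31536000"`, which the caller can convert to a number.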

5.2.2.1 Example

cache-control

5.2.3 Negotiation caching: Last-Modified & If-Modified-Since

Its precision is one second.
If the file is generated dynamically by the server, a new time is produced even when the content has not changed. To address this shortcoming, HTTP/1.1 added ETag.

5.2.3.1 Steps
  1. The server sends Last-Modified to the client in the response
  2. The browser saves the time and the content to disk
  3. On the next request, the browser takes the saved time and sends it as If-Modified-Since
  4. The server compares If-Modified-Since with the resource's Last-Modified
  5. If unmodified, the server responds 304; if modified, it responds 200
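
The server-side comparison in steps 4 and 5 can be sketched as below. The names `ConditionalResult` and `checkIfModifiedSince` are assumptions for illustration. Times are truncated to whole seconds because the header carries one-second precision.

```typescript
interface ConditionalResult {
  statusCode: 200 | 304;
  sendBody: boolean;
}

// lastModified: the resource's current modification time (HTTP date string).
// ifModifiedSince: the value the client sent, or undefined on a first request.
function checkIfModifiedSince(
  lastModified: string,
  ifModifiedSince: string | undefined
): ConditionalResult {
  if (ifModifiedSince !== undefined) {
    const resourceSeconds = Math.floor(Date.parse(lastModified) / 1000);
    const clientSeconds = Math.floor(Date.parse(ifModifiedSince) / 1000);
    const bothValid = !isNaN(resourceSeconds) && !isNaN(clientSeconds);
    if (bothValid && resourceSeconds <= clientSeconds) {
      return { statusCode: 304, sendBody: false }; // unmodified: client reuses its copy
    }
  }
  return { statusCode: 200, sendBody: true }; // modified, or no usable validator
}
```
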
5.2.3.2 Example

last-modified

5.2.4 Negotiation caching: Etag & If-None-Match

ETag is an identifier for a version of the file, generally a hash generated from its content.
The process is similar to Last-Modified; the difference is that the hash value is sent to the server through If-None-Match for comparison.
Note: if Disable cache is enabled in Chrome DevTools, If-None-Match is not sent either, so the server only ever responds with 200.
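
A minimal sketch of the ETag round trip follows. The hash is an FNV-1a style hash over the string's UTF-16 code units, chosen here only because it needs no imports. It is an assumption for illustration, not what any particular server uses. `EtagResponse`, `contentEtag`, and `respondWithEtag` are hypothetical names.

```typescript
// Small non-cryptographic hash, used here only to derive a tag from content.
function fnv1aHex(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("0000000" + hash.toString(16)).slice(-8);
}

// ETag values are quoted strings.
function contentEtag(body: string): string {
  return '"' + fnv1aHex(body) + '"';
}

interface EtagResponse {
  statusCode: number;
  etag: string;
  body?: string; // omitted on 304
}

function respondWithEtag(body: string, ifNoneMatch: string | undefined): EtagResponse {
  const etag = contentEtag(body);
  if (ifNoneMatch !== undefined) {
    // If-None-Match may list several tags; a weak prefix (W/) is ignored when comparing.
    const candidates = ifNoneMatch.split(",").map((tag) => tag.trim().replace(/^W\//, ""));
    if (ifNoneMatch.trim() === "*" || candidates.indexOf(etag) !== -1) {
      return { statusCode: 304, etag };
    }
  }
  return { statusCode: 200, etag, body };
}
```

Unlike Last-Modified, the tag changes only when the content changes, so a dynamically regenerated file with identical content still produces a 304.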

5.2.4.1 Example

etag

6. Browser local caching

  • Browser local storage raises problems such as synchronizing with back-end data and updating local copies, so it is recommended to prefer the HTTP cache

6.1 Storage

type | localStorage | HttpOnly cookie
Convenience | Requires manual access by the developer; does not support pan-domain (cross-subdomain) storage | Accessed automatically by the browser; supports pan-domain (cross-subdomain) storage
How to use | Retrieve manually and add to a request header | Cannot be read manually; the browser handles it automatically

Storing a plain (non-HttpOnly) cookie is inconvenient because it must be accessed manually, and it is less secure than an HttpOnly cookie, so it is rarely used.
sessionStorage is scoped to a single page tab and is not the same thing as the session technique used on the back end, where the session ID is stored through a cookie.
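
The "retrieve manually, add request header" path for localStorage might look like the sketch below. Everything named here is an assumption for illustration: the `KeyValueStorage` interface stands in for the browser's localStorage so the code runs elsewhere, `auth_token` is a hypothetical key, and the `Bearer` scheme is one common choice rather than something the text prescribes.

```typescript
// Minimal read-only view of a storage object; window.localStorage satisfies it.
interface KeyValueStorage {
  getItem(key: string): string | null;
}

// Hypothetical storage key for the token.
const AUTH_TOKEN_KEY = "auth_token";

// Reads the token by hand and builds the headers for an outgoing request.
function buildAuthHeaders(storage: KeyValueStorage): { [header: string]: string } {
  const headers: { [header: string]: string } = {};
  const token = storage.getItem(AUTH_TOKEN_KEY);
  if (token !== null) {
    headers["Authorization"] = "Bearer " + token;
  }
  return headers;
}
```

With an HttpOnly cookie none of this code exists on the page: script cannot read the value, and the browser attaches the cookie to matching requests by itself.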

6.2 Database

  • Cookies are generally <= 4 KB and localStorage is generally between 2.5 MB and 10 MB (depending on the browser), and neither supports indexed queries. Browsers therefore introduced databases such as WebSQL and IndexedDB
  • WebSQL is now deprecated
  • IndexedDB is close to a NoSQL database, with key-value storage, and generally offers >= 250 MB
  • IndexedDB supports asynchronous operations and transactions, and is subject to the same-origin policy

7. Business scenarios for HTTP caching

7.1 Resources that don't change often

  • E.g. images and shared library JS/CSS: set a cache lifetime of one year or more, for example cache-control: max-age=31536000 (60*60*24*365 seconds, one year)
  • There are two ways to publish an update: a hashed file name, such as the hash-named files emitted by webpack; or a version number in the query string, such as ?v=xxx or ?_=xxx
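
Both update techniques change the URL so that the long-lived cached copy is simply never requested again. The sketch below shows the idea under stated assumptions: the hash function is a small FNV-1a style hash picked for illustration (build tools such as webpack use their own hashing), and `hashedFileName` and `versionedUrl` are hypothetical helper names.

```typescript
// Small non-cryptographic content hash, 8 hex characters.
function shortContentHash(content: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < content.length; i++) {
    hash ^= content.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("0000000" + hash.toString(16)).slice(-8);
}

// "app.js" + content -> "app.<hash>.js"; new content gives a new file name.
function hashedFileName(fileName: string, content: string): string {
  const dot = fileName.lastIndexOf(".");
  const hash = shortContentHash(content);
  if (dot <= 0) {
    return fileName + "." + hash;
  }
  return fileName.slice(0, dot) + "." + hash + fileName.slice(dot);
}

// "/app.js" + "1.2.0" -> "/app.js?v=1.2.0"; existing query strings are extended.
function versionedUrl(url: string, version: string): string {
  const separator = url.indexOf("?") === -1 ? "?" : "&";
  return url + separator + "v=" + encodeURIComponent(version);
}
```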

7.2 Frequently changing resources

  • Set cache-control: no-cache and let the server compare by ETag
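
Taken together, 7.1 and 7.2 amount to choosing a Cache-Control value per resource. A minimal sketch, assuming a hypothetical naming convention in which long-lived files carry an 8-character hex fingerprint before the extension:

```typescript
const ONE_YEAR_SECONDS = 60 * 60 * 24 * 365; // 31536000

// Hypothetical convention: fingerprinted files look like name.<8 hex chars>.ext
function isFingerprinted(path: string): boolean {
  return /\.[0-9a-f]{8}\.(js|css|png|jpg|svg|woff2)$/.test(path);
}

function cacheControlFor(path: string): string {
  if (isFingerprinted(path)) {
    // Content never changes under this URL, so it can be cached for a year.
    return "public, max-age=" + ONE_YEAR_SECONDS;
  }
  // Frequently changing resources: store, but revalidate (ETag) on every use.
  return "no-cache";
}
```

The entry document (for example index.html) stays on no-cache so that it always points at the current fingerprinted file names.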

7.3 Defects

  • Suppose a website/SPA has multiple cached files, A1.js, B1.css, and C1.css. If one of them becomes invalid on its own, the mix of old and new files can cause errors. The business logic should therefore be divided into independent functional modules, ideally one JS file per module, to reduce the probability of such errors

8. Business scenarios for browser local caching

8.1 Performance optimization

  • Generally used for performance optimization, e.g. storing images, JS, CSS, HTML templates, or large amounts of data
  • However, HTTP caching should be preferred, or a combined HTTP cache + browser local cache strategy. Data cached in the browser creates two data sources, the browser and the back end, so the local copy has to be synchronized with the back-end data, updated, and destroyed, which increases the complexity of data maintenance

8.2 Authentication token

  • A JWT token is stored in a cookie with HttpOnly set, or in localStorage, and sent over HTTPS
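
For the cookie option, the server sets the flags in the Set-Cookie response header. The sketch below only assembles that header value. The cookie name `token`, the `Path=/` attribute, and the function name are assumptions for illustration, and the `Secure` attribute reflects the HTTPS requirement above.

```typescript
// Builds a Set-Cookie header value for a token cookie.
function buildTokenCookie(token: string, maxAgeSeconds: number): string {
  return [
    "token=" + encodeURIComponent(token), // hypothetical cookie name
    "Max-Age=" + maxAgeSeconds,
    "Path=/",
    "HttpOnly", // not readable from page script
    "Secure",   // only sent over HTTPS
  ].join("; ");
}
```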

The following is the Chinese version, the same content as above.

概述

基于chrome浏览器,对前端缓存的分析
前端涉及 DNS缓存、CDN缓存、HTTP缓存、浏览器本地缓存。本文只详细讨论HTTP缓存和浏览器本地缓存。
DNS缓存: 一般消耗20ms左右,浏览器会默认缓存DNS记录,一分钟以上。减少对DNS的查询。
CDN缓存:用户通过域名解析到CDN的DNS服务器,拿到CDN的负载均衡服务器ip地址,就近获取静态资源。

1. 为什么需要做缓存

  • HTTP缓存 为了减少请求或缩小响应内容
    why_http_cache
  • 本地缓存 也是为了减少请求
  • local_cache

2.HTTP缓存流程图

http_flow

Service Worker比较特殊,它实际属于浏览器本地缓存,但也在HTTP缓存流程里担任了重要角色.
下面阐述流程图里的重要节点

3.Service Worker

3.1 如何设置

  • 由开发自己写代码(cache.put)设置缓存哪些文件,设置路由规则,如果匹配到 CacheStorage里的缓存,直接返回。匹配不到通过fetch 去读取。
  • 具体写法

3.2 缓存策略

  • 永不过期,除非通过 cache.delete 删除它。
  • serviceworker_limit
  • 通常 >= 50M
  • 参考大小限制
  • 注意无论 cacheStorag/member cache/ disk cache 哪一种cache,超过了浏览器的大小限制,浏览器都会自动根据某种算法去清除1部分

3.3 举例

servicework_eg

4. memory cache

4.1 如何设置

  • 由浏览器自动设置,存储在内存里

4.2 缓存策略

  • 当前chrome的tab关闭时,会被清除
  • 自动缓存url相同的Get请求,比如 图片/字体/脚本等
  • 注意 响应头Cache-Control里仅 no-store会影响member cache不缓存。其他均和 memory cache 无关

4.3 举例

member_cache

5. disk cache

5.1 如何设置

  • 它严格遵守 HTTP 响应头,由浏览器自动设置,存储在硬盘里

5.2 缓存策略

  • 根据 HTTP 响应头来清除
  • 分为强制缓存和协商缓存两种
    cache

强制缓存指客户端缓存响应数据,减少请求数
协商缓存指客户端缓存响应数据,但仍需每次发请求,需服务端进行对比,如果资源被修改了,返回200.未被修改,返回304
disk cache都是通过http请求头和响应头来控制缓存,因此,它也被成为http缓存

5.2.1 强制缓存 Expires

指在多长时间内过期
它是绝对时间,单位秒,客户端改了时间就无效
这是旧的,已被HTTP1.1 的 cache-control:max-age 相对时间设置 取代

5.2.1.1 举例

expires_cache

5.2.2 强制缓存 cache-control
key description
no-cache ETag 响应头来告知客户端(浏览器、代理服务器)这个资源首先需要被检查是否在服务端修改过,修改过,响应200和资源内容,未修改过响应304
no-store 禁止被缓存,每次都会重新请求缓存
public 允许代理服务器 如CDN 缓存
private 不允许代理服务器 如CDN 缓存
max-age 缓存最大有效时间,单位是s
must-revalidate 超过max-age时间,向服务端发送请求,验证资源是否被修改过
s-maxage 和max-age类似,它用来设置代理服务器的缓存时间,优先级比max-age 高
no-transform 不允许代理服务器 改变文件格式 如图片

这些值可以混合使用 查看混合使用的优先级
no-cache 差不多 = max-age=0, must-revalidate
HTTP1.0 前 Pragma 也可以设置 no-cache,和 Expire配合使用

5.2.2.1 举例

cache-control

5.2.3 协商缓存 Last-Modified & If-Modified-Since

它单位是s
如文件是服务器动态生成,哪怕内容没变,也会生成新的时间。基于这缺点,HTTP1.1 补充了 Etag

5.2.3.1 步骤
  1. 服务器将 Last-Modified 告知客户端
  2. 浏览器 将时间和内容 存储到disk
  3. 下次请求 将时间拿出来给 If-Modified-Since赋值
  4. 服务器比对 If-Modified-Since 和 Last-Modified
  5. 未修改 响应304,修改 响应200
5.2.3.2 举例

last-modified

5.2.4 协商缓存 Etag & If-None-Match

Etag是文件的标识,一般指生成的hash
它的流程和Last-Modified类似,不同的地方是通过If-None-Match上传hash值去服务端比对
注意: 如果 chrome dev tool 设置了 Disable cache 也会禁掉If-None-Match的上传,导致只会响应200

5.2.4.1 举例

etag

6. 浏览器本地缓存

  • 浏览器本地存储存在和后端数据同步、本地更新等问题,建议优先考虑http缓存

6.1 storage

类型 localStorage httponly cookie
便利性 需开发者手动存取,不支持泛域存储 浏览器自动存取,支持泛域存储
使用方式 手动取出,加入请求头 手动获取不到,浏览器自动处理

明文存 cookie 既不方便,需手动存取,也没httponly cookie 安全,已很少使用
sessionStorage 会话仅能在一个页面tab里使用,它和以前后台采用的session技术不是同一回事。后台的session是通过cookie存储sessionID

6.2 database

  • 由于 cookie 一般 <= 4KB,localStorage 一般在 2.5MB 到 10MB 之间(根据浏览器不同),且不支持索引查询等。浏览器推出了 webSQL indexDB等数据库
  • WebSQL目前已废弃
  • IndexDB 接近 NoSql数据库,key-value 存储。一般 >= 250MB。
  • IndexDB 支持异步操作、支持事务,受JS同源策略限制。

7. HTTP缓存 使用业务场景

7.1 不常变化的资源

  • 比如图片/公共库js/css,设置1年以上 cache-control:max-age=36006024*365
  • 更新分两种: hash文件名,比如用webpack打出的hash名文件;加版本号,如?v=xxx 或者 ?_=xxx

7.2 经常变化的资源

  • 设置cache-control:max-age=no-cache,让服务端通过Etag去对比

7.3 缺陷

  • 当一个网站/SPA存在多个缓存文件 A1.js B1.css C1.css.此时某个文件失效了,会导致出错。因此应在业务上做好独立功能的分割。尽量一个独立功能打成一个js.降低它出错的概率

8. 浏览器本地缓存 使用业务场景

8.1 性能优化

  • 一般用于性能优化,可以保存图片、js、css、html 模板、大量数据等
  • 但是,应优先采用HTTP缓存。或者HTTP缓存+浏览器本地缓存策略。因为浏览器缓存了数据,可能会造成浏览器和后台两份数据源,存在和后端数据同步,本地数据更新,销毁问题。提升了数据维护的复杂性

8.2 鉴权token

  • JWT token,存储到设置了http only的cookie里。或local storage里。采用https通信
@mominger mominger transferred this issue from another repository Mar 9, 2022