Optimize the use of regex - Possibly a big speed improvement #7815
Comments
Yes, this has been discussed more than once. There are two ways to solve it:
1. manual result caching
2. create a hash array by device code: device-code = [ brand short ]
In fact, I've been thinking about a third implementation:
'device_code' :
'brand' : 'short code brand'
For a device code with two names:
'device_code' :
type: 'smartphone'
vars:
- os:
version: [8, 11]
brand: 'short code brand SH'
- os:
type: 'tablet' // or replace to int
version: [12]
brand: 'short code brand SM'
There are several reasons why we haven't done this yet.
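The hash-array idea above could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `DEVICE_INDEX` table, the `extractDeviceCode()` and `lookupBrand()` helpers, the placeholder brand codes, the reading of `version` as a list of major OS versions, and the simplified Android user-agent pattern. None of these are part of DeviceDetector.

```php
<?php
declare(strict_types=1);

// Hypothetical index: device code => short brand code, or a list of
// variants when one device code is shared by more than one device.
const DEVICE_INDEX = [
    'xt2343-1' => ['brand' => 'B1'], // placeholder brand code
    'demo-shared' => [
        'type' => 'smartphone',
        'vars' => [
            ['os' => ['version' => [8, 11]], 'brand' => 'SH'],
            ['os' => ['version' => [12]], 'type' => 'tablet', 'brand' => 'SM'],
        ],
    ],
];

// Pull the device code out of an Android user agent.
// Assumed shape: "...; Android <version>; <device code>[ Build/...])".
function extractDeviceCode(string $userAgent): ?string
{
    $pattern = '/Android [\d.]+; ([^;)]+?)(?: Build\/[^)]*)?\)/';
    if (preg_match($pattern, $userAgent, $m) === 1) {
        return strtolower(trim($m[1]));
    }
    return null;
}

// One hash lookup instead of a scan over every brand regex.
// Returns null when the code is not indexed, so the caller can
// fall back to the existing regex matching.
function lookupBrand(string $deviceCode, int $osMajor): ?string
{
    $entry = DEVICE_INDEX[$deviceCode] ?? null;
    if ($entry === null) {
        return null;
    }
    if (isset($entry['brand'])) {
        return $entry['brand'];
    }
    foreach ($entry['vars'] as $variant) {
        if (in_array($osMajor, $variant['os']['version'], true)) {
            return $variant['brand'];
        }
    }
    return null;
}
```

A miss costs one array lookup, so the regex scan would only run for device codes that are not in the index.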
|
Is there a PR that you can refer me to? Does it include measured statistics? I would be very interested in seeing this.
Thank you for your suggestions. Is the indexing meant to reduce the number of regexes to run? That would make sense. |
Another option is to run it in the background, rather than when the visitor enters the website, unless you're serving device-specific content. |
Install for Linux
An empty public directory is needed.
Microservice (app.php):
<?php
// RoadRunner worker that exposes DeviceDetector over HTTP.

include __DIR__ . "/vendor/autoload.php";

use DeviceDetector\ClientHints;
use DeviceDetector\DeviceDetector;
use Spiral\RoadRunner;
use Nyholm\Psr7;

// Created once per worker process and reused for every request.
$deviceDetector = new DeviceDetector();
$worker = RoadRunner\Worker::create();
$psrFactory = new Psr7\Factory\Psr17Factory();
$worker = new RoadRunner\Http\PSR7Worker($worker, $psrFactory, $psrFactory, $psrFactory);

while ($req = $worker->waitRequest()) {
    try {
        $res = new Psr7\Response();

        // Request body: {"useragent": "...", "headers": {...}}
        $content = $req->getBody()->getContents();
        $json = json_decode($content, true);
        $userAgent = $json['useragent'] ?? '';
        $headers = $json['headers'] ?? [];

        $clientHints = ClientHints::factory($headers);
        $deviceDetector->setUserAgent($userAgent);
        $deviceDetector->setClientHints($clientHints);
        $deviceDetector->parse();

        $os = $deviceDetector->getOs();
        $client = $deviceDetector->getClient();

        if ($deviceDetector->isBot()) {
            $result = ['bot' => $deviceDetector->getBot()];
        } else {
            $result = [
                'os' => $os,
                'client' => $client,
                'device' => [
                    'type' => $deviceDetector->getDeviceName(),
                    'brand' => $deviceDetector->getBrandName(),
                    'model' => $deviceDetector->getModel(),
                ]
            ];
        }

        $res->getBody()->write(json_encode($result));
        $worker->respond($res);
    } catch (\Throwable $e) {
        // Report the failure to RoadRunner and keep serving.
        $worker->getWorker()->error((string)$e);
    }
}
Run it.
Create a PhpStorm HTTP request file:
### test 1
POST http://0.0.0.0:8080/
Content-Type: application/json

{
  "useragent": "Mozilla/5.0 (Linux; Android 14; XT2343-1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Mobile Safari/537.36",
  "headers": {}
}

### test 2
POST http://0.0.0.0:8080/
Content-Type: application/json

{
  "useragent": "Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Mobile Safari/537.36",
  "headers": {}
}
Result
My base config, .rr.yaml:
version: '3'
rpc:
  listen: 'tcp://127.0.0.1:6001'
server:
  command: 'php app.php'
  relay: pipes
http:
  address: '0.0.0.0:8080'
  middleware:
    - gzip
    - static
  static:
    dir: public
    forbid:
      - .php
      - .htaccess
  pool:
    num_workers: 1
    supervisor:
      max_worker_memory: 100
jobs:
  pool:
    num_workers: 2
    max_worker_memory: 100
  consume: {}
kv:
  local:
    driver: memory
    config:
      interval: 60
metrics:
  address: '127.0.0.1:2112'
|
Stress test results, 3 min, with cache:
<?php
include __DIR__ . "/vendor/autoload.php";

use DeviceDetector\ClientHints;
use DeviceDetector\DeviceDetector;
use Spiral\RoadRunner;
use Nyholm\Psr7;

$deviceDetector = new DeviceDetector();
$worker = RoadRunner\Worker::create();
$psrFactory = new Psr7\Factory\Psr17Factory();

// Lightweight in-memory cache; it lives as long as the worker process.
class StaticCache
{
    public array $staticCache = [];

    public function fetch(string $id)
    {
        return $this->contains($id) ? $this->staticCache[$id] : false;
    }

    public function contains(string $id): bool
    {
        return isset($this->staticCache[$id]) || \array_key_exists($id, $this->staticCache);
    }

    public function save(string $id, $data, int $lifeTime = 0): bool
    {
        $this->staticCache[$id] = $data;
        return true;
    }
}

$cacheLimit = 5;
$cache = new StaticCache();
$worker = new RoadRunner\Http\PSR7Worker($worker, $psrFactory, $psrFactory, $psrFactory);

while ($req = $worker->waitRequest()) {
    try {
        $res = new Psr7\Response();
        $content = $req->getBody()->getContents();
        $json = json_decode($content, true);
        $userAgent = $json['useragent'] ?? '';
        $headers = $json['headers'] ?? [];

        // The raw request body (user agent plus headers) is the cache key.
        $key = md5($content);
        $result = $cache->fetch($key);

        if (!$result) {
            /*
             * Cache size limit: once the array grows past the limit,
             * the last (most recently added) item is dropped.
             */
            if (count($cache->staticCache) > $cacheLimit) {
                array_pop($cache->staticCache);
            }

            $clientHints = ClientHints::factory($headers);
            $deviceDetector->setUserAgent($userAgent);
            $deviceDetector->setClientHints($clientHints);
            $deviceDetector->parse();

            $os = $deviceDetector->getOs();
            $client = $deviceDetector->getClient();

            if ($deviceDetector->isBot()) {
                $result = ['bot' => $deviceDetector->getBot()];
            } else {
                $result = [
                    'os' => $os,
                    'client' => $client,
                    'device' => [
                        'type' => $deviceDetector->getDeviceName(),
                        'brand' => $deviceDetector->getBrandName(),
                        'model' => $deviceDetector->getModel(),
                    ]
                ];
            }
            $cache->save($key, $result);
        }

        $result = json_encode($result);
        $res->getBody()->write($result);
        $worker->respond($res);
    } catch (\Throwable $e) {
        $worker->getWorker()->error((string)$e);
    }
}
bzt-config.yml:
execution:
- concurrency: 500
  ramp-up: 30s
  hold-for: 3m
  scenario: quick-test2
scenarios:
  quick-test2:
    requests:
    - transaction: test1
      force-parent-sample: false
      do:
      - url: 'http://0.0.0.0:8080'
        method: POST
        body:
          useragent: Mozilla/5.0 (iPhone; CPU iPhone OS 17_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.6 Mobile/15E148 Safari/604.1
      - url: 'http://0.0.0.0:8080'
        method: POST
        body:
          useragent: Mozilla/5.0 (Linux; Android 8.1.0; PSP5522DUO) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.74 Mobile Safari/537.36
    - transaction: test2
      force-parent-sample: false
      do:
      - url: 'http://0.0.0.0:8080'
        method: POST
        body:
          useragent: Mozilla/5.0 (Linux; Android 5.1.1; A33fw Build/LMY47V; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/71.0.3578.99 Mobile Safari/537.36
      - url: 'http://0.0.0.0:8080'
        method: POST
        body:
          useragent: Mozilla/5.0 (Linux; Android 12; HCE700) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Mobile Safari/537.36
      - url: 'http://0.0.0.0:8080'
        method: POST
        body:
          useragent: Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Mobile Safari/537.36
      - url: 'http://0.0.0.0:8080'
        method: POST
        body:
          useragent: Mozilla/5.0 (Linux; U; Android 13; en-US; 23073RPBFL Build/TKQ1.221114.001) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/78.0.3904.108 UCBrowser/13.5.8.1314 Mobile Safari/537.36
    - transaction: test3
      force-parent-sample: false
      do:
      - url: 'http://0.0.0.0:8080'
        method: POST
        body:
          useragent: Mozilla/5.0 (Linux; arm_64; Android 13; TECNO CK8n) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 YaBrowser/23.5.6.42.00 SA/3 Mobile Safari/537.36
[Image: process result for 500/sec]
Final text result
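The cached worker above drops the most recently added entry with array_pop once the limit is exceeded. If keeping hot user agents in memory matters more, one possible variant evicts the least recently used entry instead. This is only a sketch: the `BoundedCache` class and its method names are hypothetical and are not part of the worker shown in this thread.

```php
<?php
declare(strict_types=1);

// Hypothetical bounded cache that evicts the least recently used key.
// PHP arrays keep insertion order, so the first key is always the oldest.
final class BoundedCache
{
    private array $items = [];
    private int $limit;

    public function __construct(int $limit)
    {
        $this->limit = $limit;
    }

    public function has(string $key): bool
    {
        return array_key_exists($key, $this->items);
    }

    public function get(string $key)
    {
        if (!array_key_exists($key, $this->items)) {
            return null;
        }
        // Re-insert the entry so it becomes the most recently used one.
        $value = $this->items[$key];
        unset($this->items[$key]);
        $this->items[$key] = $value;
        return $value;
    }

    public function set(string $key, $value): void
    {
        unset($this->items[$key]);
        $this->items[$key] = $value;
        if (count($this->items) > $this->limit) {
            // The first key is the least recently used entry.
            unset($this->items[array_key_first($this->items)]);
        }
    }

    public function size(): int
    {
        return count($this->items);
    }
}
```

In the worker loop, `get()` would replace `fetch()` and `set()` would replace the `count()` check, `array_pop()` and `save()`.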
|
I have found a big bottleneck but I cannot fix this myself, hence I am writing an issue. I am sorry if this has been discussed before or if you are already fully aware of it.
I see that Mobile::matchUserAgent() takes about 83 ms to run in my production environment, because we are calling preg_match 1500+ times.
This could be optimized by compiling all of those 1500+ regexes into one massive regex and executing that once. Symfony Router did that a while back with great success.
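A minimal sketch of the combined-regex idea, assuming the patterns are plain fragments without delimiters and do not contain `~`. The `compileCombined()` and `matchCombined()` helpers are hypothetical names for illustration, not DeviceDetector or Symfony APIs. It relies on PCRE's `(*MARK:name)` verb, which PHP reports under the `MARK` key of the match array, and on the branch-reset group `(?|...)` so that capture groups in every alternative start at 1.

```php
<?php
declare(strict_types=1);

// Join many regex fragments into one alternation.
// (*MARK:n) records which alternative produced the match.
function compileCombined(array $patterns): string
{
    $branches = [];
    foreach (array_values($patterns) as $i => $pattern) {
        $branches[] = '(?:' . $pattern . ')(*MARK:' . $i . ')';
    }
    return '~(?|' . implode('|', $branches) . ')~i';
}

// Run the combined regex once; return the index of the matching
// fragment, or null when nothing matched.
function matchCombined(string $combined, string $subject, ?array &$matches = null): ?int
{
    if (preg_match($combined, $subject, $matches) !== 1) {
        return null;
    }
    return (int) $matches['MARK'];
}

$combined = compileCombined([
    'iPhone',
    'Android (\d+)',
    'Windows NT',
]);
```

One caveat with this sketch: a single alternation returns the leftmost match in the subject, while a loop of separate preg_match calls returns the first pattern in list order that matches anywhere. The two can disagree when several patterns match the same user agent, so results would need to be compared against the existing behavior before relying on it.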