Channel: Active questions tagged javascript - Stack Overflow
regex html file href/src url pattern

I'm building an Electron app that extracts all the colors used by any website.

For that, the app downloads the URL (e.g. http://youtube.com) and saves it as an HTML file. The app then reads the HTML file and searches for every URL that links to a file which might contain a color value (rgb/rgba/#/hsl), so those files would be CSS, JS, SVG etc. Those URLs are added to an array, which is later used by the electron-download-manager package.

e.g. ['href="/main.css?v=33.1"', 'src="http://somesite.com/js/regex.js"']

The leading href=" / src=" parts are removed by other functions.

My pattern for the url is:

/(href|src)=("|')(.*?)(\.|\/)(css|js|svg|json)(.*?)("|')/g

This works for simple cases, but the match doesn't stop at the closing quote symbol (' or ").

For the first example below, the match is the whole line: it contains everything after the closing quote, so title="" becomes part of the URL, which makes no sense.

href="https://www.youtube.com/opensearch?locale=de_DE" title="YouTube"><link rel="manifest" href="/manifest.json" // matches everything until json is found

src="bla.css" // works
src='bla.css?ver=123.456' // works
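
To make the behavior above easy to reproduce, here is a minimal sketch that runs the pattern from the question against the problematic line. The `html` string is an assumed sample assembled from the example above (with the tags closed), not the real page source.

```javascript
// The pattern from the question, unchanged.
const original = /(href|src)=("|')(.*?)(\.|\/)(css|js|svg|json)(.*?)("|')/g;

// Assumed sample input based on the example line above.
const html =
  '<link href="https://www.youtube.com/opensearch?locale=de_DE" title="YouTube">' +
  '<link rel="manifest" href="/manifest.json">';

// `.` matches quotes too, so the lazy (.*?) keeps growing past the first
// closing quote until it finally reaches ".json" in the second tag.
const matches = html.match(original);

// Logs a single match that starts at the first href= and ends at manifest.json"
console.log(matches);
```

The lazy quantifier only means "as short as possible while still matching"; it does not forbid crossing a quote, which is why the first URL swallows the title attribute and the next tag.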

Is there a regex rule which says "stop at this character"?

My rule should be:

(starts with href=" or src=", then the URL, which ends with .css/.js, then an optional file version (?v=123), then the closing quote symbol)
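
One common way to express "stop at this character" is a negated character class such as `[^"']`, which matches anything except a quote. The following is only a sketch of the rule above under a few assumptions: the extension list is taken from the original pattern, the function name `extractAssetUrls` and the `sample` string are made up for illustration, and URLs that themselves contain a quote character are not handled.

```javascript
// (href|src)=       attribute name
// (["'])            opening quote, remembered as group 2
// [^"']*?           URL characters, never crossing a quote
// \.(?:css|js|svg|json)   required file extension
// (?:\?[^"']*)?     optional query string such as ?v=123
// \2                the same quote that opened the value
const assetUrlPattern =
  /(href|src)=(["'])([^"']*?\.(?:css|js|svg|json)(?:\?[^"']*)?)\2/g;

// Hypothetical helper: returns only the URL part (group 3) of each match,
// so the href=" / src=" prefix does not need to be stripped afterwards.
function extractAssetUrls(html) {
  return Array.from(html.matchAll(assetUrlPattern), (m) => m[3]);
}

// Assumed sample input built from the examples in the question.
const sample =
  '<link href="https://www.youtube.com/opensearch?locale=de_DE" title="YouTube">' +
  '<link rel="manifest" href="/manifest.json">' +
  "<script src='bla.css?ver=123.456'></script>";

console.log(extractAssetUrls(sample));
// logs [ '/manifest.json', 'bla.css?ver=123.456' ]
```

Unlike the original pattern, this sketch requires a literal dot before the extension, so a path segment like /js/ alone no longer counts as a hit. For anything beyond simple pages, parsing the HTML with a DOM parser and reading the href/src attributes directly would likely be more robust than a regex.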

